PREFACE


This document, developed for the Director of Safety (KSC-SF) 
at the John F. Kennedy Space Center, is a handbook for the 
preparation of System Safety Engineering Analyses. It provides
a general overview of system elements which are possible
subjects for system safety studies, and recommends methods
of analysis for the various study areas and types of
safety problems that may arise. The kind and form of output
data and information which safety studies should provide are
identified. Section 4 provides a summary of the basic methods
of analysis and assessment; these discussions are amplified in
the appendices for those who require more detail regarding
suitable applications, data requirements, background and theory 
of each method, and the type of conclusions that each method is 
capable of providing. Credit for much of the material in this 
handbook is due the authors of the references in Section 5, 
since these provided much of the information contained herein.

1.0 INTRODUCTION

Engineering development of a system requires systematic
identification and solution of safety problems which arise
from hazard potentials in the system. This process of
identification and solution frequently requires system safety
engineering analysis of specific systems and [illegible]. There
are a variety of methods and techniques that have been developed
for, or are particularly apt to, system safety study. These
techniques enable the performance of system safety engineering
analyses and, when integrated with total system engineering,
contribute to equipment designs and operations which satisfy
system safety requirements without compromising total system
performance.

1.1 PURPOSE 

The purpose of this document is to guide engineering specialists
in the conduct of system safety engineering studies, and to
provide criteria for the control of such studies in a
cost-effective manner.




In many projects, lack of early planning of system safety is
the principal reason for the lack of true cost effectiveness
in system safety. Historically, new systems have been conceived
for a primary mission and excluded secondary considerations such
as safety and reliability. There is generally little or no
budgetary consideration given to the safety aspect of systems
engineering in the conceptual stage. During the developmental
and early operational phase most safety problems occur and are
solved by "brinkmanship"; that is, by allowing them to become
potentially serious problems, and then forging a fix for each.
This approach lacks the unity of concept fundamental to good
cost effectiveness.

Safety engineering after the fact proves to be costly; issues
become confused, and often the fix is abandoned due to trade-offs
against schedule impact. This pendulum swing from unmodulated
under-awareness of the problem to over-reaction can be controlled
by the application of sound system safety engineering during the
conceptual or developmental phase.

1.2 SCOPE 


This document provides a general overview of system elements 
or functions which are possible subjects for system safety 
study. It identifies information and output data that a safety 
study should provide in order to support management decisions 
with respect to system safety. Most important, it identifies 
and describes a variety of analytic techniques which are
applicable to system safety problems. For each technique
described, there is a discussion of suitable applications, input
data requirements, operational steps in application, and the kind
and quality of conclusions that may be drawn.



NUMBER D2-1 19062-1

Selected technical references are cited and technical
appendices are included to identify or provide more
detailed information for the user.

1.3 OBJECTIVES

The objective of this document is to provide guidelines for
system safety engineering analysis that will allow NASA to
achieve standardization and uniformity of the overall approach
to "safety" by its various support contractors.

This document also provides the engineering analyst with a
selection of analytic tools, with instruction in their
application, to help satisfy the requirements of paragraph 1.5
by use of the techniques defined in Section 4.

1.4 SYSTEM SAFETY ANALYSIS PHILOSOPHY

Operational systems have had, and continue to have, safety
deficiencies inadvertently designed into them. The best way to
resolve safety hazards is to design them out of the system. This
may be achieved by conducting a thorough system safety analysis
considering the possible trade-offs between various design
alternatives. The philosophy dictating these analyses usually
takes one of three approaches. The first approach asks the
question: What degree of safety can be achieved from the minimum
expense? The second: What maximum degree of safety can be achieved
for a preselected expenditure? The third: What minimum expense is
required to achieve a preselected safety level? With the third
approach, caution must be exercised, for it is possible that the
most effective course of action provides a higher level of
safety at a lower expense than the preselected safety level.
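
The three approaches can be illustrated with a small sketch. The
design alternatives, their costs, and their safety levels below are
hypothetical values chosen only for illustration; they are not drawn
from this handbook or from any NASA study. The third function also
shows the caution noted above: the cheapest alternative that meets
the preselected safety level may in fact exceed that level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    name: str      # hypothetical design alternative
    cost: float    # expense, arbitrary units
    safety: float  # safety level on an assumed 0-1 scale (higher is safer)


def safety_at_minimum_expense(alts):
    """Approach 1: the alternative with the lowest expense."""
    return min(alts, key=lambda a: (a.cost, -a.safety))


def max_safety_within_budget(alts, budget):
    """Approach 2: the safest alternative within a preselected expenditure."""
    affordable = [a for a in alts if a.cost <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda a: (a.safety, -a.cost))


def min_expense_for_safety(alts, required_safety):
    """Approach 3: the cheapest alternative meeting a preselected safety level.

    The winner may be safer than required; alternatives that merely match
    the target at a higher cost are never selected.
    """
    qualifying = [a for a in alts if a.safety >= required_safety]
    if not qualifying:
        return None
    return min(qualifying, key=lambda a: (a.cost, -a.safety))


# Hypothetical data for illustration only.
ALTERNATIVES = [
    Alternative("A", cost=10.0, safety=0.90),
    Alternative("B", cost=14.0, safety=0.95),
    Alternative("C", cost=12.0, safety=0.99),
    Alternative("D", cost=20.0, safety=0.995),
]
```

With these assumed values and a preselected safety level of 0.95,
alternative B meets the target exactly, but alternative C is both
cheaper and safer, so the third approach selects C.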

Inherent in the role of system safety is the responsibility of 
properly identifying and eliminating accident causes before they 
occur. It is a fact that behind most accidents there is a 
cause that can be identified and eliminated. 

1.5 NASA SAFETY DIRECTION

The Office of Manned Space Flight (OMSF) has issued guidelines
concerning the application of system safety principles to all
manned space flight programs. The following is an extract from
a letter, Subject: Implementation and Conduct of NASA System
Safety Activities, dated July 24, 1968, and signed by the
Director of Safety (DY):

"This is to communicate the desired approach in the
conduct of system safety activities and to clearly
delineate the results expected.


PURPOSE

"The purpose of system safety activities (like all 
safety activities) is the avoidance of injury to 
people and the avoidance of property loss (including
flight hardware) to the maximum practical extent. 

BASIC APPROACH

"Similar to other NASA safety activities, system 
safety requires a basic approach as follows: 

1. Know the hazardous characteristics of the system
(including the total environment). Specifically, this
means hazards to people and property (including flight
hardware).

2. Eliminate, insofar as possible, these hazards.

If the hazards cannot be eliminated, take all practical
steps to control them. These steps include both hardware
and software considerations.

3. Identify the risks remaining as inherent in the
system, its processing and its operation either in (1) normal
modes or (2) out of tolerance modes brought about by failures
or combinations of failures. These risks are the risks to
people and property (including flight hardware).

4. Assure that the knowledge of residual risks identified
is applied to the programmatic decision-making process.

5. Recognize that the management responsibility for
achieving system safety flows along program organizational
lines.

6. Bear in mind that the desired results from system
safety activities are the minimizing of risks to the
maximum practical extent and the application of the
knowledge of these risks to management decisions. Also, assure
an understanding at all management levels as to the risks
being incurred by testing, transporting or operating the
system or portions of the system.



WHERE SYSTEM SAFETY ACTIVITIES ARE REQUIRED 

"System safety activities are required in all NASA space 
hardware programs, manned and unmanned, to assure protection 
of people and property from system flight hardware effects 
from design inception, through all systems processing 
activities, through conduct of the mission and including 
post-mission activities insofar as hazards arising from the 
mission may require. 

WHERE SYSTEM SAFETY ACTIVITIES ARE SUGGESTED 

"The philosophy, techniques and tools of the system safety 
approach are recommended, as applicable, in: complicated
industrial safety situations, complex laboratory operations,
aircraft research, and other research activities. 

WHY THE SYSTEM SAFETY APPROACH

"The reasons for an organized NASA system safety approach
include the following:

1. The complexity of systems, subsystems and components 
under extreme and complex conditions of environment and 
application. The inherent complexity of the NASA flight 
hardware systems demands analytical techniques of consider- 
able sophistication in order to achieve problem identifica- 
tion and solution. 

2. The need to fix considerable attention on the safety
considerations arising out of total systems effects, where
such effects cannot be discovered when considering portions
of the system independently.

3. The subtleties inherent in the dynamic characteristics 
of flight hardware systems. 

4. The need to assure that the safety aspects of the 
mission under normal conditions and under mission failure 
conditions are adequate. 

5. The need to assure that system safety measures at all
steps leading up to and after the mission are adequate.

HOW TO IMPLEMENT SYSTEM SAFETY ACTIVITIES


"Successful and, therefore, satisfactory conduct of 
system safety activities include the following points 
of approach: 

1. Personnel assigned in system safety work are to be — 

a. Qualified to conduct the work 

b. Assigned, exclusively, to the system safety mission 

c. Organizationally placed to assure effectiveness. 

2. Analytical techniques appropriate to the situation are
to be used.

3. System safety is to take advantage of all useful inputs." 


**** 

It is quite obvious from the above quotation that NASA 
management recognizes the need for a systematic analytic 
approach to system safety engineering. This document attempts
to formalize the KSC-SF implementation of the above requirements.

2.0 SELECTION OF METHOD

The system which confronts the analyst may vary considerably in 
complexity from one assessment to the next. Whether the scope 
of analysis encompasses an entire manned spaceflight center such 
as KSC, or whether it is limited to one component such as a valve 
or relay, the "system" approach is equally valid. The safe 
development and use of a system involves many managerial, 
engineering, manufacturing and operational disciplines, regard- 
less of whether that system is a complete launch facility or an 
individual device used on that facility. Application of the 
systems approach assures that the requirements and objectives of 
the system "user" will be realized in the safest and most econ- 
omical manner the state of technology will allow. The usefulness 
of the systems approach increases as the complexity of the problem 
to be solved increases. Therefore, KSC Safety management must 
select from among the various methods of system analysis available 
that which is required to satisfy the safety problem posed. 

For example, the question may be asked, "What is the numerical
probability that death will be incurred by operational personnel 
during all phases of assembly, test and checkout of the Space 
Vehicle for Mission X?" Answering that question requires a 
complex detailed quantitative analysis spanning many facilities 
and agencies. 

Another example: A question of quite different character may
be asked of the system safety analyst. "What specific risks
to equipment and men must be avoided during the operation of
hypergolic propellant transfer unit, number abc, during
Spacecraft loading at the launch facility?" This question is not
only much smaller in scope and complexity, but suggests a
qualitative analysis. Relative probabilities may be useful for
assessment formulation and critical risk identification, but
the absolute statistical analysis required to answer the question
in the first example is not necessary or even desirable because
of the undesirable costs of "over analysis."

When system safety engineers are required to perform analyses
at the same time that the system design is developing, the
system managers may not provide specific questions to be
answered, but will still require a complete assessment of the
level of safety allowed by the proposed design. Maximum
benefit is derived from analyses conducted during design phases
because alternatives and trade-offs can be compared for optimal
safety, and the best solution can be incorporated in the final
system design without expensive modification to the completed
system.

The degree of system design definition available to the analyst 
may dictate the method of analysis. It is impossible to con- 
struct a Failure Mode and Effects Analysis (FMEA), or much of 
a fault tree when only the basic scheme for the system is known. 

A Gross Hazards Analysis, as defined in paragraph 4.1, completed
in time may demonstrate that some other design concept is 
essential if a high degree of safety is to be obtained. Gross 
Hazards Analysis provides a quick method for the system safety 
engineer to apply experience from detailed analyses conducted 
for other systems which have a reasonable degree of similarity 
to the proposed system design concept. 

The extent and detail of the safety analysis required early in 
the program is largely dependent on the complexity of the system 
to be analyzed and the desired accuracy of the answer, and this 
will indicate the best analytical method to be used. 

The difficulty of matching the size of the analytical effort
to efficiently provide the required visibility of risk can be
solved in successive steps. If sufficient time is allowed the
analyst, a preliminary analysis may be conducted to predict the
best analytical method to use for the formal analysis to follow.

The preliminary analysis to be performed should at least consider: 

(1) The contractual or binding system safety requirements.
How accurately must safety be measured? A high degree
of accuracy implies a detailed, quantitative analysis.
Minimum allowable accident probabilities may be explicit
in the contract.

(2) How hazardous does the system seem? Does the system
require a large or close man-machine interface? Are
high energies stored in the system? Are weight or structural
criteria such that normal safety factors must be reduced?
Is the system operated in environments for which it was not
designed? Are subsystems required to protect man and
machine from severe environments? Affirmative answers
imply highly hazardous systems.

(3) What level of technology is required to design and build
the system relative to the state-of-the-art? New ideas
and ways of solving system design problems frequently
imply an unusual element of risk.

(4) What level of technological skill is required to operate 
the completed system relative to estimated present skill 
levels of the user? A new type of system which requires 
the user to learn new skills, beyond merely acquiring 
systems familiarization, implies that he will also need 
to be aware of the new risks inherent in the system in 
more detail than users who have already mastered the 
required skills. 

(5) If the user is now operating, or is about to operate, a 
finished system, he may specify safety analyses which he 
already knows he needs. The specific problems he poses 
may dictate the method of analysis to be conducted, either 
directly or by inference. If not, compare his stated 
safety problems with 1 through 4 above. 

The type and character of the safety problems should be form- 
ulated and the best method selected which will provide the re- 
quired outputs, and will scope the system level for which the 
safety problem is formulated. 

Finally an assessment of the available data must be made to 
determine the possibility of providing the required analytical 
outputs with the method selected (see Section 3). After screening 
the methods in such a manner, several methods may still appear 
to be practical. The analysis method requiring the least overall 
effort is normally chosen in that case. However, if the analysis 
of the immediate safety problems will point out additional areas 
where analysis will be required, then consideration must be 
given to using the method which provides a baseline for further 
analytical work. This may cause the analyst to recommend a 
method which involves a more extensive original analytical .effort 
than would otherwise be chosen, so that material savings will be 
realized in future safety analyses. 
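
The screening sequence described above can be sketched as a short
procedure: keep only the methods that provide the required outputs
with the data available, prefer a method that leaves a baseline when
follow-on analysis is expected, and otherwise choose the least
overall effort. The catalogue entries, their output and data labels,
and the relative effort figures below are hypothetical assumptions
made for illustration; they are not taken from the method selection
matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    name: str
    outputs: frozenset      # kinds of output the method can provide (assumed labels)
    data_needed: frozenset  # kinds of input data the method requires (assumed labels)
    effort: int             # relative overall effort, arbitrary units (assumed)
    baseline: bool          # True if results serve as a baseline for later work


def select_method(methods, required_outputs, available_data,
                  follow_on_expected=False):
    """Screen candidate methods and return the preferred one, or None."""
    required_outputs = frozenset(required_outputs)
    available_data = frozenset(available_data)
    # Keep methods that deliver every required output with the data on hand.
    feasible = [m for m in methods
                if required_outputs <= m.outputs
                and m.data_needed <= available_data]
    if not feasible:
        return None
    # When further analysis is expected, prefer methods that leave a baseline.
    if follow_on_expected:
        with_baseline = [m for m in feasible if m.baseline]
        if with_baseline:
            feasible = with_baseline
    # Otherwise the least overall effort is normally chosen.
    return min(feasible, key=lambda m: m.effort)


# Hypothetical catalogue for illustration only.
CANDIDATES = [
    Method("Gross Hazards Analysis",
           frozenset({"hazard identification"}),
           frozenset({"system concept"}), effort=1, baseline=False),
    Method("FMECA",
           frozenset({"hazard identification", "relative criticality"}),
           frozenset({"system schematics", "failure rates"}),
           effort=3, baseline=False),
    Method("Fault Tree",
           frozenset({"hazard identification", "relative criticality"}),
           frozenset({"system schematics"}), effort=4, baseline=True),
]
```

Under these assumptions, removing "failure rates" from the available
data or setting follow_on_expected to True shifts the choice from
FMECA to Fault Tree, even though Fault Tree carries the larger
initial effort.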

An example of method selection drawn from actual experience
on the Apollo Program is provided below.

The combined System Safety organization of NASA, Boeing TIE,
and Bellcom conducted meetings to compile a list of potential
accidents in the Apollo program. The accidents were prioritized
on the basis of program experience, mission criticality and
expectations of the likelihood of occurrence. The top priority
safety problems centered around the Astronauts who were to fly
each manned mission. The analytical problem was defined in
qualitative terms and, in essence, said:

"Identify all hazards which may cause death or injury of
the Flight Crew from the time of entry into the launch
[illegible] through all following mission
phases including splashdown and recovery from the Command
Module of the spacecraft."

Several methods of analysis could provide hazard identific- 
ation, but fewer methods could provide the relative criticalities 
of the risks incurred by the Flight Crew as they came within 
the area of influence of each hazard. Some means was required 
to identify those hazards for which the present risks were 
acceptable. The ideal method would provide numerical prob- 
abilities of each hazard causing the accident to be avoided, 
namely death or injury to one or more Astronauts. Fault Tree 
(logic diagram) and Failure Mode, Effects, and Criticality 
Analysis (FMECA) became the candidate methods. 

A review of the available data disclosed that failure data 
would be very difficult to obtain in the form needed, and 
that in some cases the data sample was very small. This is 
characteristic of a system for which a low production quantity 
is required, such as a research program like Apollo. This 
forced the reliance on relative assessments of criticalities 
for each hazard identified. The lack of exacting failure data 
indicated that a better perspective of the problem could be 
maintained with the Fault Tree method rather than the FMECA 
method. The availability of some failure history, equipment 
level FMEA's and other types of engineering analyses was con- 
sidered to fit into the Fault Tree method better than FMECA. 
Further, the analysis team was spread from East Coast to West 
Coast and team membership involved several agencies. The Fault 
Tree method provided an efficient communication and analysis 
management tool. The final considerations were analytical 
resources and the long term System Safety analysis requirements. 

The potential accident of death to the Astronauts only began 
the list of many potential accidents which the user, NASA, 
wished to prevent. The utility of the Fault Tree in a complex
study area, its capability to keep pace with the changeability
encountered at this program level and the detailed analysis
documentation it provides form an excellent baseline for
future analysis. This baseline allows maximum conservation 
of analytical effort, and thereby minimizes long term manpower 
requirements. Had the study area been confined to a less
complex system, say the Saturn Booster, then the FMECA approach
might have been selected, particularly when consideration had
been given to the analyses already in progress for that level
of system study and the time available to complete the system
safety analysis.

METHOD SELECTION MATRIX

System safety studies must provide management visibility and
engineering counsel regarding the safe construction and 
operation of systems. To accomplish this purpose there are 
several types of analysis results, or outputs, which may be 
reported singly, or in combinations which are most productive 
in terms of safety assurance in a given situation. These are 
listed as output requirements on the matrix on page 2-6.

The method of analysis should be effective for the study area 
under consideration from the viewpoint of time, cost, and method 
capability. The study areas are listed across the top of the 
method selection matrix. 




The analysis methods are shown at the intersecting columns and
rows for study areas and output requirements. These are
suggested only as a guide, and use of the matrix should not
replace an assessment of each specific situation.

3.0 DATA INPUTS

3.1 TYPES OF DATA 

The system safety analyst will find that data required to conduct 
an analysis of a system are large in quantity and vary consider- 
ably. The quantity of data required depends on the size and 
complexity of the system to be assessed. However, the types of 
data that must be collected for the analysis are predictable. 

These types are discussed in the following paragraphs. 

3.1.1 System Function and Description 

In the conceptual phase, system specifications should be gathered
before the analysis begins. Procurement of the system is con- 
trolled by requirement specifications that define the user's objec- 
tives, design constraints, and requirements such as conformance to 
standards or codes. 

In the developmental phase system design drawings must be gathered 
as the analysis begins. The most useful of these are system func- 
tional logic diagrams or flow diagrams. In all analyses, great use 
is made of system schematics; and in some analyses, module, drawer 
and component level schematics are necessary. Installation drawings 
are useful when assessing the possible effects of high energy 
release accidents such as high voltage shorts, explosions, and 
fires. Installation drawings help in the analyses of accident 
control equipment (inerting or water systems) and in assessing 
emergency egress capabilities. Detail part drawings are usually 
not useful except when safety critical components have been
identified in the analysis. Analyses which are conducted after the
system is built may be expedited by reference to technical manuals
and operation and maintenance manuals.

3.1.2 System Environment 

The system's environment may be determined from requirements
specifications and design constraints. Further environmental
data may be required as the analysis develops, to answer specific
questions about the effects of environment on particular portions
of the system. The environment may not be constant in time or
may vary from one part of the system to another at any given
point in time. It will be necessary to collect interface data
which affects the system's function relative to safe use.
Installation drawings are useful if spatial relationships are
pertinent to failure mode causes or effects. The energy sources
in the system being analyzed may not appear to be hazardous until
the other systems in the accident-induced environment are known.

3.1.3 Failure Data

Whether the analysis is going to be quantitatively evaluated or 
not, some failure data becomes necessary as it develops. Without 
any insight about relative failure probabilities, all failures may 
be considered equally likely. This will cause single failure 
points which are critical to safety to appear to be the most likely 
to cause an accident. Strangely enough, this may not identify the 
most critical failure potentials. Since the probability that a 
given fault will occur when it can cause the potential accident 
depends on both the failure rate and the total time it may be 
causative, multiple failures may be more likely to create the 
accident than one failure. Therefore, the probable time from the 
actual fault to the detection of that fault is required. If there 
is no means of "safing" the system upon detection of a critical 
fault, the time from detection to repair can be used. In the case 
of faults which will not be detected when they occur the best 
estimate to use is the time to periodic maintenance or the test 
frequency. 
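
The dependence on both failure rate and exposure time can be made
concrete with a small calculation. The sketch below assumes a
constant failure rate, so that the probability of a fault occurring
within an exposure time t is 1 - exp(-rate * t), and assumes the two
faults in the dual case are independent. The exponential model and
all numeric values are assumptions for illustration and are not
taken from this handbook.

```python
import math


def fault_probability(failure_rate_per_hour, exposure_hours):
    """Probability that a fault occurs within the exposure window,
    assuming a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-failure_rate_per_hour * exposure_hours)


def joint_probability(*probabilities):
    """Probability that all of several independent faults have occurred."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


# Hypothetical single failure point: low failure rate, and the fault is
# detected and the system safed within one hour.
single = fault_probability(1e-6, exposure_hours=1.0)

# Hypothetical dual failure: each fault is more likely and stays hidden
# until periodic maintenance, assumed here to be 1000 hours away.
dual = joint_probability(
    fault_probability(1e-4, exposure_hours=1000.0),
    fault_probability(1e-4, exposure_hours=1000.0),
)
```

With these assumed figures the dual failure is several thousand times
more probable than the single failure point, which is why treating
all failures as equally likely can hide the most critical failure
potentials.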

Any data which helps the analyst select critical failures is con- 
sidered as "failure" data. A consideration of the safety factor 
in the design is helpful. If components are operated at or near 
their failure limits, the probability of failure is greater than 
if a large safety margin has been allowed. If the failure limits 
are not well defined for a component because of state-of-the-art 
limitations, then the chance for a design error in establishing 
safety factors is greater than when failure limits can be accurately 
estimated and proven in test programs. Usually when safety factors 
cannot be well established for the design, high factors are used. 
This in itself can sometimes pose a concern for the analyst. 

If FMEA's have been conducted for components, modules, etc., of
the system, these can be used to indicate the failure probability.
FMEA's with quantitative evaluation are best, but caution is 
advised because the failure modes considered may not exactly co- 
incide with the failure mode required in the safety analysis. See 
Paragraph 4.5 on use of FMEA’s as an analytical tool. 

Direct, raw failure history obtained during test and operation of
the system is useful if found in sufficient quantity. Since direct
history on the components is usually not sufficient in itself, this
may be complemented by generic failure data from PRINCE, FARADA
(see Paragraph 3.1.6.2 a and b), or other reliability failure data
files. These generic rates are hard to use for two reasons. First,
the stated failure rates include all known modes of failure for
that component. In some cases both primary and secondary failures
have been grouped together, and in others only primary failures
have been reported. The analysis normally requires failure rates
for only a few of all possible modes, both primary and secondary.
Secondly, the conditions under which the failures actually
occurred may be significantly different from the operating
conditions experienced by the component in the system under study.
This leads to "fudge factors" which are a large source of error in
the final probability of failure of the component in question. The
selection of the most accurate failure rate is therefore quite
difficult and time consuming.
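
One way to picture the two adjustments is the sketch below, which
scales a generic rate by the fraction attributed to the failure mode
of interest and by an environmental factor. This multiplicative form,
the function names, and the numbers are assumptions for illustration
only; the handbook does not prescribe a formula. Carrying a low and a
high environmental factor through the calculation shows how much
uncertainty the "fudge factor" introduces.

```python
def adjusted_failure_rate(generic_rate, mode_fraction, environment_factor):
    """Scale a generic failure rate to one failure mode and one environment.

    generic_rate       -- rate from a generic data file, all modes combined
    mode_fraction      -- assumed share of failures in the mode of interest
    environment_factor -- assumed multiplier for the actual operating conditions
    """
    if not 0.0 <= mode_fraction <= 1.0:
        raise ValueError("mode_fraction must lie between 0 and 1")
    if environment_factor <= 0.0:
        raise ValueError("environment_factor must be positive")
    return generic_rate * mode_fraction * environment_factor


def rate_bounds(generic_rate, mode_fraction, factor_low, factor_high):
    """Return (low, high) adjusted rates for a range of environmental factors."""
    return (
        adjusted_failure_rate(generic_rate, mode_fraction, factor_low),
        adjusted_failure_rate(generic_rate, mode_fraction, factor_high),
    )


# Hypothetical values: generic rate of 2e-5 per hour, one quarter of
# failures in the mode of interest, environmental factor between 2 and 10.
low, high = rate_bounds(2e-5, 0.25, 2.0, 10.0)
```

In this assumed case the adjusted rate spans a factor of five, which
illustrates why the choice of factor can dominate the error in the
final probability.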

3.1.4 System Simulation Data


Employment of system simulation testing and data may provide an 
excellent basis for safety judgments and design decisions on new 
systems. A reasonable approximation of the use environment can 
be obtained by testing portions of the system which are deemed to 
be essentially independent or whose interaction with the rest of 
the system can be simulated. Additionally, some cause-effect 
characteristics may be developed mathematically upon a physical 
basis. This can be done with reasonable accuracy for electrical
networks and structural components because of the accurate speci- 
fication of manufacturing tolerances and the ability to express 
theoretical relationships. 

3.1.5 Other Studies


When engineering studies of subsystems are found, they may be 
useful in avoiding a new analysis of the same subsystem. The 
analysis is more useful if a quantitative evaluation is provided 
for the probability of the failure or fault event of the subsystem. 

3.1.6 Sources of Data


Much of the data to be collected is found in engineering libraries, 
drawing files, and general libraries and information centers main- 
tained by both private and government agencies. The systems analyst 
will find, however, that most of the information procured from data
centers must be complemented by information gained through direct
interface with the organizations who create the data. Well
established communications with these organizations will facilitate
the understanding of the data collected, and will ensure that a
knowing and realistic use is made of the information obtained.
Misused data causes the creation of an unused analysis. The most
important quality of an analysis is validity.

3.1.6.1 Data Generating Organizations


a. Design Engineering 


Design Engineering is a source of valuable information on the
operating and functional characteristics of the system. Knowledge
of proposed changes to the system can be acquired during
the conceptual and initial design change stage, and suggestions
made to the designers to provide a safer, cost-effective change.
System design changes which are needed for improved system
safety can be discussed with the designers to select the most
effective design alternative with respect to safety and system
effectiveness.

The interface between system safety analysts and Design
Engineering requires a "day-to-day" working relationship
between members of each organization. The results of this close
relationship are inherently beneficial to both organizations.

b. Maintainability 

Maintainability is a design discipline that provides for ease,
economy and safety in all maintenance functions and the use of
maintenance equipment. Therefore, system safety engineers work
with Maintainability to perform safety analyses on maintenance
equipment and to certify the safety of maintenance equipment
design and maintenance operations.

c. Human Engineering 

Human engineering and system safety engineers must use human
factor statistics as a part of the safety analyses. A study of
man-machine relationships complements system safety by providing
additional emphasis on human error analysis and error reduction.
These are critical considerations in determining potential system
modes that can result in hazardous conditions. Identification
and analysis of the overall hazardous consequences of a given
failure event require an understanding of human capabilities and
limitations as well as the interfaces between subsystems, systems,
and environments. Man-machine relationship studies, to be
effective, must be integrated with system safety to provide a
logical and consistent continuum throughout the life span of the
aerospace system.

d. Reliability 

A function of Reliability is system hardware analysis for failure
data, such as failure modes, failure effects, mean time between
failures, probabilities of failure and assessment of system
failures on mission accomplishment. Much of this type of data
is used for both qualitative and quantitative system safety
analysis. For example, existing and substantiated failure
modes and effects data is an invaluable aid in the qualitative
logic diagram analysis of a system. In a quantitative logic
diagram evaluation, hardware failure rate data is a necessary
item. Conversely, the results of a system safety analysis may
have a direct impact upon reliability, such as requiring further
[illegible] certain hardware or improving the reliability of a
particular system element, to decrease the likelihood of system
damage or human injury. It should be noted that complete
numerical parity should not be expected because reliability
"numbers" normally refer to both primary and secondary failures
for particular failure modes. Thus, it is entirely possible for
a system to have reliability which is the complement of one
failure per 1000 operating hours and a probability of an
injurious or damaging undesired event of one per 1,000,000
operating hours.

e. Health and Safety 

System Safety is concerned with test, assembly, checkout, 
maintenance and use of systems which provide a possibility 
of serious injury, loss of life, loss of equipment or signi- 
ficant equipment damage as a result of the existence of the 
system. Health and Safety is concerned with providing a safe 
working environment for employees. There is some overlap be- 
tween the two functions and in this case the more stringent 
standards of acceptability would apply. 

The Health and Safety activity can aid system safety engineers 
by providing information and data on human factors, toxic 
materials, anthropometric considerations and other specialized 
data related to the human working environment. 

f. Quality Assurance 

The system significance of a particular event or part detail 
cannot be determined by study of the design alone. Therefore, 
predictive system safety analyses must be made from drawings, 
procedures and other documented instructions. The accuracy of
these analyses and the conclusions derived from them are
dependent on the activities of quality technicians and
inspectors in assuring that instructions are followed.

Quality requirements are determined and satisfied throughout
all phases of contract performance. The Quality Assurance
program ensures that quality aspects are fully included in all
designs and that high quality is obtained in the fabricated
articles. Any change required to improve component, subsystem,
or system performance without compromising quality, reliability
or safety should be incorporated at the earliest practical point
in development and fabrication. The Quality Assurance program
provides for the early and prompt detection of actual or
potential deficiencies, system incompatibility, marginal quality,
and trends or conditions which could result in unsatisfactory
quality. Objective evidence of quality conformance, including
records of inspection and test results, is useful data for system
safety analyses to provide a high level of confidence in the
representa[illegible] and confidence in the assignment
of probabilities to the fault events.

g. Test Planning and Reporting 

Special tests are conducted on hardware end items for
reliability data, qualification, quality assurance, and
system hardware integration. From these tests considerable
data is produced which is useful for system safety
evaluation. Conversely, requirements for special tests
to obtain data specifically needed to assure system safety
may result from system safety analyses.

System Safety analyses conducted on proposed test plans 
may initiate special test procedures and corrective 
measures to existing test plans. 

h. Configuration Management 

Configuration Management describes, identifies, and 
controls system configuration throughout the definition, 
development, production and change phases. System safety 
analyses require a well defined baseline configuration so 
that changes in configuration may be assessed after the 
basic system analysis is completed. Establishing the base- 
line configuration engineering data is a function of 
Configuration Management. 

3.1.6.2 Data Storing Organizations

Specific organizational sources of data for the conduct of
system safety analyses are listed in AFSC Design Handbook,
DH 1-6, Chapter 2. Brief descriptions of four large data
storage and retrieval organizations are included here to typify
what is available to systems analysts.

a. Parts Reliability Information Center 

The NASA Parts Reliability Information Center (PRINCE) is
a specialized data center developed and maintained by the
George C. Marshall Space Flight Center. The PRINCE provides
an automated data storage and retrieval system containing
technical information which is useful to reliability
analysts. The data contained can also be used by system
safety analysts in compiling specialized failure history
for analysis evaluations.

b. Failure Rate Data Handbook 

The FARADA Program document is a component part "Failure
Rate Data Handbook" (FARADA). Updating and expansion of
the data is accomplished by the FARADA Information Center
at U. S. Naval Ordnance Laboratory, Corona, California.
The Handbook contains component and part information
relative to failure rates generated by contractors and
agencies engaged in design, development and production
of military and space program equipment. The failure
rates contained in the Handbook are obtained from
specific engineering data and test results.

c. Defense Documentation Center

The Defense Documentation Center (formerly ASTIA) is a 
large storage and indexing program of all types of 
scientific and technical information from many sources 
including federal agencies, industrial concerns, educa- 
tional institutions, and research foundations. Information 
on hardware, software and complete systems is available, 
and many references and papers on analytical procedures 
and methods are easily found in the Center. 

d. Interservice Data Exchange Program

The Interservice Data Exchange Program (IDEP) is a data
storage and filing program which can be used by the
analyst to acquire information for system safety assessment
at all levels of complexity from components to complete
programs or projects. The objectives of the IDEP program are:

1. To avoid repetition of tests already satisfactorily
accomplished. 

2. To provide prompt indication of possible failure modes. 

3. To reduce duplicate expenditures for developmental 
parts testing and non-standard parts justification. 

4. To encourage standardization of methods of test 
and test reporting. 
This section describes various qualitative and quantitative
techniques which may be used in safety analysis. A brief
discussion of data sources available to the safety analyst,
and methods to resolve identified hazards are included.

The complexity of present and proposed aerospace systems, the
number of individuals and organizations involved in their
development, and the inherent desire for multi-mission
capability all tend to create system safety problems. Increasing
system acquisition and modification costs require that a system
safety approach be identified early in the development stage
so that it may have some impact upon design requirements and
trade-off decisions. The degree of safety achieved in an
aerospace system is a basic design problem; its resolution lies in
the application of safety engineering and its assessment is
gained through engineering analysis.

Analyzing system and subsystem design is the fundamental act 
by which insight into safety design effectiveness can be 
accomplished. Without safety analysis, safety design defects 
are exposed by the unpleasant experience of accident investig- 
ation. 

The various safety analysis techniques to be discussed in this
handbook are Gross Hazards Analysis, (4.1); Operations and Test
Safety Analysis and Operations Safety Research, (4.2); Fault
Tree or Logic Diagram Analysis, (4.3); Fracture Mechanics
Assessment, (4.4); and Failure Modes, Effects and Criticality
Analysis, (4.5).


Cautions in Safety Analysis 

Although various safety analysis techniques may be available,
these should not be regarded as tools to be applied to every
design problem, particularly those where a definite alternative
is clearly the proper solution. Statistical and analytical
techniques are not a replacement for common sense. This is
particularly true in analyzing research and development programs.
Employment of a mathematical technique may indicate that the
probability of an undesirable event occurring due to a given
set of circumstances is 1 x 10^-[exponent illegible]. If the
event would cause loss of the system and can be precluded without
significant cost or degradation of performance, why accept any
risk? The concept of establishing an acceptable level of risk can
result in acceptance of unnecessary risk. The purpose of safety
analysis is to expose hazards and minimize or preclude risk.
Predictions may be inaccurate by an order of magnitude when an
event is associated with human behavioral variances.
4.1 GROSS HAZARDS ANALYSIS (See Appendix A)

4.1.1 Summary Description of Technique 

The technique of gross hazards analysis is a comprehensive,
qualitative hazard assessment applicable to complete systems
or major segments of a system. The gross hazard study should
be conducted early in the design phase or modification phase
of the system.

A good gross hazards study will identify critical areas of the
system, product, or end item which should be subjected to
additional safety analysis or which indicate a need to change a
design requirement. The study will also provide management
personnel with visibility of the adequacy of safety features of
the system and information about the likely contingency conditions.
The study should help to identify routine or special test
requirements and will be very valuable in establishing priorities
to allow scheduling and manning of the safety effort. A necessary
result of the gross hazard study will be the establishment of
upper and lower limit definitions for standard hazard categories
in terms of the system under study. Controlling design criteria,
such as existing codes, regulations, standards or policies and
procedures, may be identified to assure coverage of all gross
hazards identified in the study. Any gross hazards which have
been identified, and for which no controlling design criteria
exist, should be covered by specific criteria in the gross
hazards study.

4.1.2 Applications of Gross Hazards Analysis 

4.1.2.1 Priorities and Ground Rules

The gross hazards study will allow the definition of the system 
safety task. With this task defined for the system under study 
it will be possible to establish system safety goals and priorities 
in accordance with established mission or contract objectives. 

The analysis schedule and manpower requirements may then be
planned through the program phases which have been forecast.

Standard hazard categories spelled out in terms of the system
under study should be clearly defined. The upper and lower
limits of each hazard category should be clearly defined because
these will establish the ground rules for setting goals and
priorities.
4.1.2.2 Design Control Criteria

Criteria to be applied to the system during design activity to
minimize hazards to personnel or equipment should be identified
for the designers. These criteria will include existing safety
codes, regulations and standards as well as design standards,
codes, and procedures applicable to the system, subsystems and
components under study. Where existing criteria are inadequate
for the level of safety desired, planning to correct the
inadequacies should be initiated. The types of follow-on safety
analysis required to continue the system safety analysis should
be specified in accordance with the advantages, including cost
effectiveness, of each type of analysis.

4.1.2.3 Implementation

Action items which result from gross hazard studies should be 
specifically assigned to assure completion. Assignments for 
specific phases of the analysis which may be performed by 
designers and personnel other than the system safety analysts 
should be planned and prioritized to the level of detail 
necessary to assure successful completion of the study. 

4.1.3 Input Data Required for Gross Hazards Analysis 

The gross hazards analyst must be supplied with the system 
specifications, diagrams, manuals, procedures, requirements 
and history for use in familiarization, evaluation, and planning 
corrective action. Hazard and failure experience of similar, 
related or interfacing systems should also be obtained. (See 
Figure 3-1). 

4.1.4 Gross Hazards Analysis Procedure 

The basic gross hazards analysis procedure consists of breaking 
the system down into units of various types, by use of functional 
flow diagrams or other techniques, and then subjecting each unit 
to analysis for gross hazards. 

All systems have a purpose. To achieve this purpose, operation 
or functioning of the system can be broken down into a series of 
steps or functions. These steps or functions are inter-related 
in such a way as to perform the purpose of the system. The 
functions or steps, and their relationships, can be shown in a 
form commonly known as a "flow diagram". Flow diagrams can be 
prepared to show as much detailed information as is desired. The 
amount of detail required in flow diagrams prepared for a given 
system is a function of the depth of analysis required. Common
practice is to begin with a gross functional flow diagram and
prepare successively more detailed diagrams until the desired
level of detail is achieved.

Some flow diagrams may have already been prepared on a system
as an aid to basic system design. However, if the analysis
must be conducted on a system which is still in a preliminary
design stage, few flow diagrams will have been prepared.
Preparation of necessary system flow diagrams must, therefore, be
accomplished through the safety analysis function. The process
of preparing these flow diagrams can provide system
understanding, more detailed identification of system hazard areas,
a basis of communication with other engineering functions, and
information for more detailed safety analysis.

When a gross hazardous condition is identified, the system 
event, subsystem, operation or facility is listed as a safety 
critical item. The listing should include a specific descrip- 
tion of the hazard. 

Each identified gross hazard should then be eliminated, circum- 
vented or controlled by a recommendation from the system safety 
organization for an engineering change to the design, or a 
procedural change, or both. 

If the fault which leads to the gross hazard cannot be readily 
determined, a recommendation for more detailed safety analysis 
should be made. 
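
The breakdown-and-listing procedure described above can be sketched
in code. This is a minimal illustration only, assuming the functional
flow breakdown is held as nested functions and that the analyst has
already noted hazards against each function. The class name Function,
the helper safety_critical_items, and the propellant loading example
are hypothetical and are not taken from this handbook.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Function:
    """One block of a functional flow diagram (hypothetical structure)."""
    name: str
    hazards: List[str] = field(default_factory=list)          # hazards noted by the analyst
    subfunctions: List["Function"] = field(default_factory=list)


def safety_critical_items(function: Function, path: str = "") -> List[Tuple[str, str]]:
    """Walk the breakdown depth-first and list (location, hazard description) pairs."""
    full_path = f"{path}/{function.name}" if path else function.name
    items = [(full_path, hazard) for hazard in function.hazards]
    for sub in function.subfunctions:
        items.extend(safety_critical_items(sub, full_path))
    return items


# Illustrative system only: a gross flow diagram refined one level further.
system = Function("Propellant loading", subfunctions=[
    Function("Connect transfer line", hazards=["Propellant leak at coupling"]),
    Function("Transfer propellant", subfunctions=[
        Function("Pressurize supply tank", hazards=["Tank overpressure"]),
        Function("Monitor flow"),
    ]),
])

items = safety_critical_items(system)
for location, hazard in items:
    print(f"{location}: {hazard}")
```

Each listed item would then receive a recommendation for a design
change, a procedural change, or more detailed safety analysis, as
described above.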
4.2 OPERATIONS SAFETY ANALYSIS (See Appendix B)

4.2.1 Summary Description of Technique

The technique of Operations Safety Analysis is a means of
identifying tasks that are hazardous in the operation of
a system. There are two major areas of consideration. In
this handbook they are divided into Operations and Test
Safety Analysis and Operations Safety Research.

4.2.2 Application of Operations Safety Analysis

The results of OSA's, specifically safety requirements for 
each task, can be used as either direct input to the detailed 
procedures for the task, or can provide a baseline for criteria 
standards, manuals, or handbooks against which the detailed 
procedure is written. 

4.2.3 Input Data Required For Operations Safety Analysis

The operational safety analyst will require as basic data the 
project requirement specifications, the system specifications, 
the operating procedures and the appropriate safety procedures 
and regulations that have been established for the type of 
operation being analyzed. In addition, test requirements and 
test and checkout procedures are needed for OSA-I. Many other 
types of data can be useful, as indicated in Figure 3-1.

4.2.4 Operations Safety Analysis Procedure

Since each of the major areas of consideration is unique, the
analysis procedures are described separately.

4.2.4.1 Operations and Test Safety Analysis (OSA-I)

The Operations and Test Safety Analysis (OSA-I) method identifies
operations that are inherently hazardous or which, by the
nature of the function sequences, can lead to development of
hazards in the operation of a system. This method can be used
in all aspects of system operation from construction to mission
termination.


The objective of performing OSA's is to ensure that hazards,
existing or developing during a particular task, are identified,
documented and brought to the attention of the proper authorities
for resolution. Such hazards may result from the task itself,
or from interaction of other work being done concurrently with
the task. The OSA's will include corrective action
recommendations which serve to eliminate these hazards, or reduce
them to an acceptable level. Each task is reviewed and the
reasoning for a particular safety requirement is recorded to
substantiate program decisions.


Each task (act, process, or test) can be analyzed individually
to ensure complete investigation of all situations requiring
safeguards, special equipment, or specific instructions (e.g.,
cautions, warnings, or verifications) to avoid personnel injury
or significant equipment damage. Previous analyses of hazards
in specific areas of operation should be used to the maximum
extent.

4.2.4.2 Operations Safety Research (OSA-II)

As the name implies, operations safety research involves the 
safety research of operations. In this method, operations are 
researched to determine how to create and use systems in the 
safest manner. The techniques used in operations research
provide a scientific approach to decision making that involves the
operations of a system. The relative safety of alternatives is
a characteristic of the system similar to reliability,
maintainability, cost effectiveness, flexibility, and operability.
The use of operations research assumes that the system user's
objectives include maximum safety within the constraints of
minimum cost and other objectives of the mission.

The principal techniques of operations research which may be
applied to optimizing system safety are Linear Programming,
Network Analysis, Dynamic Programming, Game Theory, Queueing
Theory, Markov Chains, and the techniques of Simulation. All
systems engineering analysis methods use these techniques to
some degree, because of the fundamental nature of the problem
of systems analysis and design. This problem is concerned with
achieving a balance of many conflicting parameters and variables
to accomplish the objectives of the system user. Brief
explanations of the Linear Programming method and Network Analysis
are provided in Appendix B, Part II.

4.2.4.3 Human Error Prediction Techniques

In both of the above Operations Safety Analyses, a consideration
of possible human error may be appropriate.
4.3 FAULT TREE ANALYSIS (See Appendix C)

4.3.1 Summary Description of Technique

The System Safety fault tree logic diagram analysis method
consists of three basic analytical elements, viz:

1. System Safety fault tree development

2. System Safety failure data development

3. System Safety fault tree evaluation.

The System Safety fault tree is a logic oriented graphic 
representation of independent failure combinations which may 
interact or may singly produce system failures or undesired 
events within normal system operating modes. The diagram alone 
is a qualitative tool. When combined with failure data inputs, 
an evaluation can be made and dominant paths can be identified. 
The analysis then becomes an effective quantitative approach 
to accident prevention. 

The following steps are essential as a basis for a systems 
approach to safety and will enable identification of undesired 
(hazardous) events which are to be maintained at an acceptable 
level: 

1. Identification of undesired events;

2. Structuring of undesired events into a logic diagram;

3. Determination of fault inter-relationships;

4. Evaluation for "likelihood" of undesired events; and

5. Trade-off decisions and/or corrections.

Steps one and two are necessary to develop a "Top" logic diagram
which serves as a guide showing how and where the tree is to be
developed (or expanded) by further analysis activity. The "Top"
logic diagram organizes all of the logic relationships unique to
a system into a pattern which provides an orderly and logical
manner for analyzing the system hardware and software functions.

The variable logic relationships which are unique to a system
and must be structured are such things as: (1) operating modes,
(2) mission phases and/or operations, (3) degree of man/machine
relationship in the system, (4) inter-relationships of the Centers
with the system functions, and (5) functional order of the system.

Step three is the development of the fault tree analysis which
starts with the "Top" logic diagram structure and proceeds
through the hardware level.

Step four is an evaluation of the completed logic diagram for

(a) determining the likelihood of undesired events, and

(b) determining the identity and ranking of series of events
and event relationships leading to the undesired event(s).

Step five is a further assessment of the analysis results to 
determine what corrective action is required. Proposed corrections 
such as design changes, procedure changes, training methods, 
added safety features, etc., can be evaluated in the context 
of the fault tree for the desired improvement. 

Two points are vital to a meaningful and useful analysis. First, 
the output of an analysis is only as valuable and reliable as 
the effort and information applied to the analysis. Second, con- 
figuration control of the hardware and the operating procedures 
must be maintained lest erroneous conclusions be drawn from the 
analysis. 

System Safety fault tree analysis is dependent on and
complementary to many other engineering functions. These include:

1. Configuration management for a baseline configuration, 
changes, specifications, requirements, verification and 
certification of manufactured end items, data on operating 
time or cycles, and schedules on approved changes. 

2. Design engineering for information on the operating and 
functional characteristics of the system and the proposed 
changes. 

3. Quality assurance for providing a level of confidence that 
the equipment and system conform to the documentation. 

4. Test and operations for plans and data which may be used
in the fault tree evaluation. 

5. Reliability for such failure data as failure modes, effects
and criticality analyses, failure rates, mean-time-between-
failures, failure probabilities, and assessment of system
failures on mission accomplishment.

6. Maintainability for maintenance functions and use of main- 
tenance equipment. 

7. Human engineering for equipment design characteristics
providing efficient, accurate and safe utilization of the
equipment by the operators.

8. Health and safety for provisions of a safe working
environment for employees.

While it is recognized that there is a significant degree of 
inherent compatibility between System Safety analyses and 
reliability, complete numerical parity should not be expected. 
Reliability figures refer to both primary and secondary 
failures for particular failure modes.

A system may have a reliability which is the complement of 
one failure per 1000 operating hours but the probability of 
a significant undesired event (accident) could be one per 
1,000,000 operating hours. It is possible that safety consider- 
ations make it necessary to attain greater reliability from 
some equipment even though the system reliability is already 
adequate to perform the desired mission. 
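
The difference between the two figures can be made concrete with a
short calculation. The sketch below is an illustration only: it
assumes constant event rates (an exponential model) and an arbitrary
100-hour operating period, neither of which is specified in this
handbook. The two rates are the ones quoted above.

```python
import math


def prob_at_least_one(rate_per_hour: float, hours: float) -> float:
    """Probability of one or more events in `hours`, assuming a constant rate."""
    return 1.0 - math.exp(-rate_per_hour * hours)


failure_rate = 1.0 / 1_000         # failures per operating hour (from the text)
accident_rate = 1.0 / 1_000_000    # accidents per operating hour (from the text)
operating_hours = 100.0            # assumed period, for illustration only

p_failure = prob_at_least_one(failure_rate, operating_hours)
p_accident = prob_at_least_one(accident_rate, operating_hours)
reliability = 1.0 - p_failure      # reliability is the complement of failure

# Fraction of failures that would have to end in an accident for both
# figures to hold at the same time.
hazardous_fraction = accident_rate / failure_rate

print(f"reliability over period  : {reliability:.4f}")
print(f"accident probability     : {p_accident:.6f}")
print(f"hazardous failure ratio  : {hazardous_fraction:.4f}")
```

Under these assumptions only about one failure in a thousand would be
one that leads to an accident, which is why the reliability figure
alone says little about safety.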

4.3.2 Applications of Fault Tree Analysis

The fault tree method is generally applicable at any level of
complexity of system or any size of study area. The
cost-effectiveness of the fault tree method remains approximately
constant at all levels except when analyzing only detail parts,
and no system analysis is required. Fault tree methods are
especially well adapted to large program level analyses. When the
method is applied in program wide study areas, exceptionally strong
technical communications between the analysts involved must be
established at the beginning and maintained throughout the
analysis. The analysis of system operating modes and phases at
the top of the tree progresses more slowly than analysis at the
hardware level because of the many alternatives usually encountered.
However, the fault tree development at the top levels, where many
of the contingencies and operating alternatives are sorted out,
can point out any large risks inherent in the system. For example,
in the Apollo program, the sequence of missions and their
associated objectives greatly affect the risks incurred by the
astronauts. The top tree may point out these incurred risks, and a
new sequence can be modeled to assess the trade off benefits.

4.3.3 Input Data Requirements For Fault Tree Analysis

After defining the scope of the system to be analyzed, certain
information must be gathered so that the system may be
characterized and pertinent aspects simulated for analysis.
(See Figure 3-1.)

4.3.3.1 System Function and Description

System specifications should be gathered early. These will not 
only provide a description of the system, but will explain why 
certain design concepts are used when the analyst is studying 
system logic diagrams, flow diagrams and schematics. Detail part
drawings are seldom useful, unless the analyst is totally
unfamiliar with the components and modules in the system. Analyses
conducted after a system is built can be expedited by reference to
technical manuals and operation and maintenance manuals.
4.3.3.2 System Environment

The system's environment may be determined from requirements 
specifications and design constraints. Further environmental 
data may be required as the tree develops, to answer specific 
questions about the effects of environment on particular portions 
of the system. The environment may not be constant in time or 
may vary from one part of the system to another at any given 
point in time. It is sufficient in the beginning of the analysis 
to collect general environmental data, and gather detailed data 
only as required. Since other systems which interface with the 
system under analysis form part of the environment, it will be 
necessary to collect interface data which affects the system's 
function relative to safe use. Installation drawings are useful 
if spatial relationships are pertinent to failure mode causes or 
effects. The energy sources in the system being analyzed may not 
appear to be hazardous until the other systems in the accident 
induced environment are known. 

This inter-system effect may cause some difficulty if the adjoining 
system is outside the scope of the authorized analysis. A judge- 
ment must be made about the extent of analysis required to complete 
the fault path in the other system to the potential accident. 

Since a finding such as this reverses the basic fault tree process, 
a new study should be recommended for potential accidents caused
by the affected adjoining systems. If the top potential accident 
is defined in sufficiently narrow terms at the outset, this 
reversal may never occur. It is extremely difficult, however, 
to turn away from a legitimate safety concern because it falls 
outside the range of the original task. This facet of fault tree 
analysis, which seems to lead the analyst, is most beneficial 
because it points out problems which would not normally be detected. 
This aspect also poses a problem to the system safety manager, 
since he must guard against losing sight of the original problem. 

4.3.3.3 Failure Data

Whether the tree is going to be quantitatively evaluated or not,
some failure data becomes necessary as the tree develops. Without
any insight about relative failure probabilities, all failures
may be considered equally likely. This will cause single failure
points and paths adjoining them through OR gates to the potential
accident to be critical. Strangely enough, this may not identify
the most critical paths. Since the probability that a given
fault will occur when it can cause the potential accident depends
on both the failure rate and the total time it may be causative,
multiple (simultaneous, sequential, or random) failures may be
more likely to create the accident than one failure. Therefore,
the probable time from the actual fault event to the detection
of that fault is required. If there is no means of "safing" the
system upon detection of a critical fault, the time from
detection to repair can be used. Maintainability analysts 
should be able to provide accurate estimates of the required 
period of maintenance. In the case of faults which will not 
be detected when they occur the best estimate to use is the 
time to periodic maintenance or the test frequency. If safety 
is truly jeopardized in the case of undetected failures, 
increased test or maintenance frequency may be a sound solution. 
The addition of a monitoring device may be advisable, if it 
does not create an increase in the hazard level or increase the 
probability of the occurrence of the basic fault event. 
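
The effect of exposure time can be illustrated numerically. The
sketch below is a hedged example, not a method prescribed by this
handbook: it assumes constant failure rates, independent faults, and
purely hypothetical rates and exposure times. It compares one
promptly detected single-point failure with a pair of faults that
each remain undetected until periodic maintenance.

```python
import math


def fault_probability(failure_rate: float, exposure_hours: float) -> float:
    """Probability the fault is present during its exposure time (constant rate assumed)."""
    return 1.0 - math.exp(-failure_rate * exposure_hours)


# Hypothetical single-point failure: low rate, detected and safed within 2 hours.
single_point = fault_probability(failure_rate=1e-6, exposure_hours=2.0)

# Hypothetical pair of latent faults: both must be present to cause the
# accident, but each can go undetected for the 500 hours between
# periodic maintenance checks.
latent_a = fault_probability(failure_rate=1e-4, exposure_hours=500.0)
latent_b = fault_probability(failure_rate=1e-4, exposure_hours=500.0)
double_fault = latent_a * latent_b   # independence assumed

print(f"single-point failure path : {single_point:.2e}")
print(f"double latent fault path  : {double_fault:.2e}")
```

With these assumed values the two-failure path is the more likely
route to the accident, which is the situation the text warns about.
Shortening the test interval reduces the exposure time and therefore
the probability of the double fault path.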

Any data which helps the analyst select critical paths is 
considered as "failure" data. At one extreme, the analyst may 
have some expert provide a qualitative assessment, or he may 
have to rely on his own judgement on each component failure or 
basic fault event. A consideration of the safety factor in the 
design is helpful. If components are operated at or near their 
failure limits, the probability of failure is greater than if a 
large safety margin has been allowed. The possible effect of 
the man-machine interfaces from design through use should be 
"added" to this safety factor rule.


4.3.3.4 Other Studies


When engineering studies of subsystems are found, they may be 
useful in avoiding a second analysis of an undesired event in 
the same subsystem using the fault tree. An FMEA of the
subsystem may include the failure modes needed. The FMEA is more
useful if a quantitative evaluation is provided for the
probability of the failure or fault event of the subsystem. See
Section 4.5 on the use of FMEA's as an analytical tool.
Engineering analyses other than FMEA can also be used to supplant
further development of the tree for an undesired event. It is 
often helpful to informally extend the tree beyond the level 
that the engineering analysis is to be used when assessing the 
adequacy of the substitution. Three or four levels of tree 
usually are sufficient for this purpose. 
4.3.4 Fault Tree Procedure

The fault tree is a logic-oriented graphic representation of
parallel and series combinations of independent failures and
operating modes that can result in a specified undesired event.

The diagram can be quantified when required to provide a relative
measure of the paths leading to the events.

The term "event" denotes a dynamic change of state that occurs
to a system element, which may be hardware, software, personnel
and/or the environment. If the event results in not achieving
the intended function, or in achieving an unintended function, it
is known as a fault event. Conversely, if an intended function is
achieved as planned, it is known as a normal event.

Fault events may be basic events or gate events. Basic events are 
independent events whereby system elements (usually at component 
level) go from an unfailed state to a failed state and they are 
related to a specific failure rate and fault duration time. Basic 
events are used only as inputs to a logic gate. 

A gate event is one which results from the output of a logic gate 
and is therefore a dependent event. As a fault tree progresses, 
gate events on one level become inputs to gate events on the next 
higher level. 

In fault tree analysis the inherent modes of failure of system
elements are referred to as primary events, secondary events and
command events, and are depicted on the fault tree as the
combination of basic events and gate events. Primary, secondary and
command failures are defined as follows:


Primary Failure:     Failure initiated by failures within, and of,
                     the component under consideration, e.g.,
                     resulting from poor quality control during
                     manufacture, etc. Applied only to the
                     component during Fault Tree Analysis when a
                     generic failure rate is available.

Secondary Failure:   Failure initiated by out-of-tolerance
                     operational or environmental conditions, i.e., a
                     component failure can be initiated by failure
                     not originating within the component.

Command Failure: *   The component was commanded/instructed to fail,
                     i.e., resulting from proper operation at the
                     wrong time or place.


* A component may not always have a command failure mode (e.g., a
standard bolt), in which case this mode may be disregarded.

The development of a fault tree starts at the top or undesired
event. The analysis determines what events can cause the
undesired event. These become inputs to the top event. They can
be two or more events, any one of which can cause the top event.
Otherwise, they can be two or more events, all of which must occur
at the same time to cause the top event. The first group passes
through an "OR" gate to get to the top event. The second group
passes through an "AND" gate to get to the top event. The analyst
then determines what can cause the input events. Each branch can
be developed independently or concurrently. At some level below
the top event the analyst will arrive at a piece of hardware (or
subsystem). Each piece of hardware (or subsystem) can fail in
three or fewer ways (i.e., primary failure, secondary failure, or
command failure).

The dynamic change of state is defined as a binary type event,
being either in the ON or OFF state. The ON state (or 1)
corresponds to a failed condition and the OFF state (or 0) corresponds
to an unfailed condition. By representing events and gates in a
binary manner, logic diagrams can be analyzed by the techniques of
Boolean algebra.
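
The binary convention can be illustrated with a short sketch. The
function names below are illustrative and are not taken from this
handbook; the code assumes only that 1 denotes the failed (ON) state
and 0 the unfailed (OFF) state.

```python
def and_gate(*inputs):
    """Output is 1 only if every input event is in the failed (1) state."""
    return int(all(inputs))


def or_gate(*inputs):
    """Output is 1 if one or more input events are in the failed (1) state."""
    return int(any(inputs))


if __name__ == "__main__":
    # Print the truth table for two input events.
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, "AND ->", and_gate(x, y), "OR ->", or_gate(x, y))
```

Because each gate returns 0 or 1, the output of one gate can be passed
directly as an input to a gate on the next higher level.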


FAULT TREE SYMBOLS

[Symbol drawings not reproduced: the AND gate and OR gate symbols,
each labeled with its OUTPUT and INPUTS.]


AND GATE describes the logical operation whereby
the coexistence of all input events is required
to produce the output event. When hand sketches
of fault trees are made, a dot (•) is placed in the
center of the symbol to avoid confusion to the
draftsman.


OR GATE defines the situation whereby the
output event will exist if one or more of
the input events exists. When hand sketches of
fault trees are made, a plus sign (+) is placed in
the center of the symbol to avoid confusion to
the draftsman.


The rectangle identifies an event (gate event)
that results from the combination of fault
events through a logic gate. The words describing
the event are placed within the box. When machine
drafting with computer control is used, the
computer program will limit the number of character
spaces that can be used in any one block.



The diamond describes a fault event that is
considered basic in a given fault tree. The
possible causes of the event are not developed
either because the event is of insufficient
consequence or the necessary information for
further development is unavailable. It also
can indicate non-development because an
analysis already exists that is of satisfactory
depth and breadth. In any case the reason
should be stated, either in the symbol box or
in cross-referenced notes.



The circle describes a basic fault event that
requires no further development. The frequency
and mode of failure of items so identified are
derived from empirical data. The rate of
occurrence of such a primary event is normally the
generic failure rate of the component for the
particular failure mode.




The transfer triangle indicates that a section of the fault tree
is drawn once and used in more than one place on the tree. If
the triangle is drawn under the event block, it means that the
diagram that would appear underneath is drawn under some other
event box in the tree. Since all events and logic below the triangle are
transferred from one event to another, all necessary and sufficient
conditions to cause both events must be identical. If the
triangle is drawn at the side of the event block, it means that the
diagram drawn below is used in its entirety to satisfy the input
conditions for more than one event. The event designation within the box
is identical on both diagrams. Cross reference between a transferred
diagram and the events which use it is accomplished by coding the
triangles with the same letters or numbers.



The numbers and letters appearing in the symbols above are coding
devices to permit the diagrams to be drawn by a computer controlled
drafting machine. They are also used to identify an event; for
example, "the E-4 event on the IIT Diagram."

A Sample System 

An automatic gas hot-water heater is a good example to use in
illustrating the elements of a system. The task of the system is to
provide hot water in our house at all times. In order to perform this
task a system is used whose components consist of a water tank, a gas
heater, a temperature measuring and comparing device to regulate the
system, a controller (actuated by the temperature measuring device) to
turn a valve to control the flow of the gas, a pressure relief valve
(to permit excess pressure to escape if the heater fails to shut off),
a cold water intake pipe, a hot water pipe leading to the faucets, and
an exhaust pipe for the flue gases from the gas heater.

[Figure 4-1. Example of a System (Domestic Hot Water System). Labels
recoverable from the drawing: Hot Water Faucet (Normally Closed), Flue
Gases, Cold Water, Check Valve, Stop, Gas, Air.]

From the view of task performance, we can examine the system to see in
what ways failure or malfunction of the components can stop delivery of
hot water when we want it, or, more importantly, in what ways the system
might get out of control and the tank rupture or gas escape. The
interrelations of the components are apparent to anyone familiar with
the operation of such a heater and we can trace through the system the
effects of any component breakdown.

In normal operation the tank is filled by cold water. The water temperature
in the tank is monitored by the temperature measuring device and this
temperature is compared with the preselected temperature. When the water temperature
in the tank is less than the desired temperature, the controller opens the gas
valve, allowing gas to flow to the burner. When the water in the tank reaches
the desired temperature, the controller causes the gas valve to close, allowing
no more gas to flow to the burner. The pressure relief valve acts as a safety
device by venting excessive pressure.

Now that the system is understood, we should define our undesired
event. This would be the rupture of the hot water tank. Having
determined the undesired event, it is necessary to analyze what
could cause it. For the tank to rupture, the water in the tank
must overheat and the relief valve must be unable to open. It is
now necessary to determine what could cause the water in the tank
to overheat. Either the gas valve fails to close, allowing gas to
flow to the burner, or the controller fails to actuate the gas
valve, which would allow gas to flow to the burner, or the
temperature device fails to actuate the controller, which also would
allow gas to flow to the burner.
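
The logic just described can be checked mechanically. The following is
a minimal sketch, assuming the four basic events named in the paragraph
above; the event strings and function names are illustrative only, and
the brute-force search for the smallest failure combinations is an
added convenience rather than a procedure prescribed by this handbook.

```python
from itertools import combinations

# Basic events in the hot water tank example (labels are illustrative).
BASIC_EVENTS = (
    "relief valve unable to open",
    "gas valve fails to close",
    "controller fails to actuate gas valve",
    "temperature device fails to actuate controller",
)


def tank_ruptures(failed):
    """Top event: water overheats AND the relief valve is unable to open."""
    overheats = (
        "gas valve fails to close" in failed
        or "controller fails to actuate gas valve" in failed
        or "temperature device fails to actuate controller" in failed
    )
    return overheats and "relief valve unable to open" in failed


def minimal_failure_sets():
    """Smallest combinations of basic events that cause the top event."""
    found = []
    for size in range(1, len(BASIC_EVENTS) + 1):
        for combo in combinations(BASIC_EVENTS, size):
            candidate = set(combo)
            # Skip any combination that contains a smaller one already found.
            if any(previous <= candidate for previous in found):
                continue
            if tank_ruptures(candidate):
                found.append(candidate)
    return found


if __name__ == "__main__":
    for failure_set in minimal_failure_sets():
        print(sorted(failure_set))
```

Each combination found pairs the relief valve failure with one of the
three overheating causes, which is the AND-over-OR structure of the
simple fault tree.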



Simple Fault Tree 
Figure 4-2 

The fault tree in Figure 4-2 presents a very simplified 
analysis. This diagram is a graphic representation of logical 
relationships, and these may be expressed in Boolean algebra. 

Only if both event A and event N exist simultaneously can
event M occur. Events A and N have some probability of occurrence,
P(A) and P(N) respectively. The probability that M occurs is
expressed as P(M) = P(A) x P(N).

The fault tree in Figure 4-3 shows that N occurs if any one
of the events B, C, or D occurs. These events may occur in any
combination, but only one must occur to cause event N. The
probability of event N is expressed as,

P(N) = P(B) + P(C) + P(D) + [P(B) x P(C) x P(D)]
       - [P(B) x P(C) + P(B) x P(D) + P(C) x P(D)]

A complete derivation of this equation can be found in most 
texts on set theory or Boolean algebra. 

In most cases, the probability of a failure event is quite small,
i.e., in the order of 10^-2 or less. If 10^-2 is assumed as an
upper limit, then:

P(N) = 10^-2 + 10^-2 + 10^-2 + [10^-6] - 3 [10^-4]
     = 3 x 10^-2 - 299 x 10^-6
     = 2.9701 x 10^-2

In the approximation, if

P(N) = P(B) + P(C) + P(D)

had been used, an error of only about one percent would have been
introduced. Failure probabilities are normally much smaller
than 10^-2, and the error of approximation would very likely
be much smaller than one percent.

Therefore, a valid approximation of the probability of the top
event M is expressed,

P(M) = P(A) x [P(B) + P(C) + P(D)]
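
The size of the approximation error can be checked numerically. The
sketch below assumes independent events, as the text does; the function
names are illustrative. The exact OR-gate value is computed as one
minus the probability that none of the inputs occurs, which for three
independent events equals the expanded expression given for P(N).

```python
def p_and(*probabilities):
    """AND gate: product of independent input probabilities."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def p_or_exact(*probabilities):
    """OR gate: 1 minus the probability that no independent input occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


def p_or_approx(*probabilities):
    """OR gate, small-probability approximation: simple sum of the inputs."""
    return sum(probabilities)


if __name__ == "__main__":
    p_b = p_c = p_d = 1e-2  # the upper limit assumed in the text
    exact = p_or_exact(p_b, p_c, p_d)
    approx = p_or_approx(p_b, p_c, p_d)
    print(f"exact  P(N) = {exact:.6e}")
    print(f"approx P(N) = {approx:.6e}")
    print(f"relative error = {(approx - exact) / exact:.2%}")
```

The top event probability then follows as p_and(P(A), P(N)) with either
value of P(N).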

Frequently the diagram in Figure 4-2 is all that is needed 
to lead the analyst to a sound conclusion. On the other hand, 
if it is necessary to trace out possible faults in each piece 
of component hardware then the logic diagram might look like 
Figure 4.3.4B. 

A fault tree should be carried down only to the point that one
is sure there is no additional significant data to be derived.

It is pointed out, however, that if a quantitative analysis is
desired, then the fault tree must be carried to the level of
component parts, or subsystems, which have a failure rate
that has been determined by test or analysis. Then by the
application of Boolean algebra in combination with other failure
probability computation techniques (Lambda-Tau or Monte Carlo),
a probability of occurrence of the top undesired event can be
calculated.
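
As a rough illustration of the two techniques named above, the sketch
below takes each basic event probability as failure rate times fault
duration time (the usual reading of "Lambda-Tau", assumed here and
valid only when the product is small) and then estimates the top event
probability of the simple tree M = A AND (B OR C OR D) by random
sampling. The failure rates, durations, trial count and function names
are invented for illustration and are not handbook data.

```python
import random


def basic_event_probability(failure_rate_per_hour, fault_duration_hours):
    """Lambda-Tau style estimate, valid when the product is much less than 1."""
    return failure_rate_per_hour * fault_duration_hours


def simulate_top_event(p_a, p_b, p_c, p_d, trials, seed=0):
    """Monte Carlo estimate of P(M) for M = A AND (B OR C OR D)."""
    rng = random.Random(seed)
    occurrences = 0
    for _ in range(trials):
        a = rng.random() < p_a
        n = (rng.random() < p_b) or (rng.random() < p_c) or (rng.random() < p_d)
        if a and n:
            occurrences += 1
    return occurrences / trials


if __name__ == "__main__":
    # Hypothetical inputs: rates per hour and a 1000-hour fault duration.
    p_a = basic_event_probability(1e-4, 1000.0)
    p_b = p_c = p_d = basic_event_probability(5e-5, 1000.0)
    analytic = p_a * (1.0 - (1.0 - p_b) * (1.0 - p_c) * (1.0 - p_d))
    estimate = simulate_top_event(p_a, p_b, p_c, p_d, trials=200_000)
    print(f"analytic P(M)    = {analytic:.5f}")
    print(f"Monte Carlo P(M) = {estimate:.5f}")
```

For a tree this small the Boolean expression is simpler and exact;
sampling becomes attractive only when the tree is large or the events
are not independent.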
4.4 FRACTURE MECHANICS ASSESSMENT (See Appendix D)

4.4.1 Summary Description of Technique 

Pressure vessels generally contain small flaws or defects, which 
are either inherent in the materials or are introduced during a 
fabrication process. These defects can in many cases cause a 
severe reduction in the load carrying capability and severely 
reduce the operational life spans of pressure vessels. If the 
flaws are large in comparison to that required to cause failure 
at the proof pressure stress levels, failure will occur during 
initial pressurization. On the other hand, if the initial flaws
are small the vessels may withstand a number of operational
pressure cycles and a number of hours of sustained pressure loading
before the flaws attain the size needed for failure to occur.

From an economic standpoint it is important that the possibility
of failure of launch vehicle and spacecraft pressure vessels
during proof testing be minimized. From the standpoint of
economics and personnel safety, it is imperative that operational
failures be prevented.

The primary purpose of this method is to set forth criteria
which, when followed, will minimize the occurrence of proof test
failures and provide assurance against pre-flight and flight
operational failure of launch vehicle and spacecraft pressure
vessels. Within the constraint of "no service failures", the
criteria are intended to provide a maximum degree of latitude in
the selection of materials and operational stress levels, detail
design, analysis, and test in order to allow weight and cost
minimization as may be dictated by specific vehicle and mission
requirements.

The method is applicable to metallic pressure vessels designed 
primarily for internal pressure. This includes high pressure 
gas bottles, solid propellant motor cases, and storable and 
cryogenic liquid propellant tanks - both integral and removable. 
Pressurized cabins, inflatable structures and vessels fabricated 
from composite materials are not included. 

The three basic considerations in the prevention of proof test
and service failures of metallic pressure vessels are the
initial flaw sizes (K_Ii), the critical flaw sizes (i.e., the
sizes required to cause fracture at a given stress level) (K_Ic),
and the subcritical flaw growth characteristics. The prevention
of proof test failure is dependent upon the actual initial flaw
sizes being less than the critical flaw sizes at the proof
level. In order to guarantee that the vessel will not fail in
service, it is necessary to show that the largest possible
initial flaw in the vessel cannot grow to critical size during
the required life span. The basic parameters affecting critical
flaw sizes are the applied stress levels, the material fracture
toughness values, the pressure vessel wall thickness, the flaw
location and the flaw orientation. The determination of actual
initial flaw sizes is limited primarily by the capabilities of
the non-destructive inspection procedures; however, as will be
discussed, a successful proof pressure test provides a direct
measure of the maximum possible initial flaw size. Subcritical
flaw growth depends upon a number of factors including stress
level, flaw size, environment, pressure vessel material, and
the pressure vs. time/cycle profile.

Because of the many factors involved, it is unlikely that the 
problem of premature fracture will be completely resolved in the 
immediate future. However, during the past ten to fifteen years 
significant progress has been made in several different areas 
(i.e., mechanics, metallurgy, inspection etc.) with the accomp- 
lishments in the field of fracture mechanics being particularly 
significant. Linear elastic fracture mechanics has provided a 
basic framework and engineering language for describing the 
fracture of materials under static, cyclic and sustained stress 
loading. The technical approach used in developing the criteria 
set forth in this document is based on this framework. 

4.4.2 Application of Fracture Mechanics Assessment

In aerospace work, systems frequently require use of pressure
vessels, both thin walled and thick walled. Because of weight
or space restrictions it sometimes is necessary to reduce the
normal safety factors used in the design of such vessels.

Experience indicates that small flaws in the vessel structure 
sometimes cause reactions of a hazardous nature. Pressures used 
in testing and phenomena associated with the use of gases or 
chemicals cause the flaws to propagate until damage is effected 
to the vessel and to the surrounding environment and personnel. 

This danger can be minimized and predicted by conducting an 
assessment of the pressure vessel's fracture mechanics character- 
istics. 

4.4.3 Input Data Requirements for Fracture Mechanics

The Fracture Mechanics technique requires that information from
systems specifications, diagrams and drawings, manuals, procedures,
requirements and history be provided for use in familiarization,
evaluation and assessment. Items of information needed include
plane stress intensity factors and fracture toughness of the
material, including threshold intensity level; the size and shape
of the surface flaw; the thickness of the plate; the design
operating stress and the proof test stresses; the ultimate strength
of the material and yield strength; and data from the procedures
pertaining to time and cycles. Hazard and previous failure
experience of similar, related and interfacing systems should
also be obtained. (See Figure 3-1)
4.4.4 Summary Description of Fracture Mechanics Assessment

This section sets forth some of the criteria for the design of 
fracture resistant pressure vessels. Fracture specimen tests 
and fracture mechanics analyses shall be performed for the 
purposes of predicting critical flaw sizes at the proof and 
operating stress levels, predicting probable failure modes, 
determining allowable stress intensity ratios (i.e., K_Ii/K_Ic
ratios), determining allowable flaw sizes, and assisting in the
determination of allowable design deviations. The specific 
criteria governing each of these areas are as follows: 

4.4.4.1 Critical Flaw Sizes

The critical flaw sizes at the proof and operating stress levels
shall be determined for the parent metal and weldments in all
high stressed areas of a vessel. Where the total applied stress
levels are below the material tensile yield strength, the
critical flaw sizes shall be calculated using the appropriate
stress intensity equations, the applied stress, and the measured
plane strain fracture toughness value (K_Ic). Where the total
applied stress exceeds the material yield strength, critical
flaw sizes shall be empirically determined using fracture
specimens which contain flaws that simulate those that can be
encountered in the actual vessel.
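
The stress intensity equations themselves belong to Appendix D and are
not reproduced in this section. As a rough sketch only, the generic
linear elastic relation K = Y x stress x sqrt(pi x a) can be solved
for the flaw size a at which K reaches K_Ic. The geometry factor Y, the
function name, and the numerical values below are assumptions made for
illustration; they are not handbook values, and the relation applies
only where the applied stress is below the yield strength.

```python
import math


def critical_flaw_size(k_ic, stress, geometry_factor=1.0):
    """Flaw size a at which K = Y * stress * sqrt(pi * a) reaches K_Ic.

    Units must be consistent, e.g. k_ic in ksi*sqrt(in) and stress in ksi
    give a flaw size in inches.
    """
    if stress <= 0 or geometry_factor <= 0:
        raise ValueError("stress and geometry factor must be positive")
    return (k_ic / (geometry_factor * stress)) ** 2 / math.pi


if __name__ == "__main__":
    k_ic = 50.0              # hypothetical toughness, ksi*sqrt(in)
    proof_stress = 100.0     # hypothetical proof stress, ksi
    operating_stress = 70.0  # hypothetical operating stress, ksi
    print("critical size at proof stress:",
          round(critical_flaw_size(k_ic, proof_stress), 4), "in")
    print("critical size at operating stress:",
          round(critical_flaw_size(k_ic, operating_stress), 4), "in")
```

The critical size is smaller at the higher proof stress than at the
operating stress, which is why a successful proof test bounds the
largest flaw that can be present when the vessel enters service.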

Prevention of proof test failure requires that there should be 
no initial flaws in the vessel greater than the critical sizes 
at the proof stress levels. Accordingly, if the predicted 
critical flaw sizes are smaller than the sizes which have been 
demonstrated to be reliably detectable by nondestructive inspec- 
tion, the vessel design shall be modified so as to increase the 
critical sizes. 

4.4.4.2 Failure Mode Analysis

A failure mode analysis shall be performed for each completed 
pressure vessel design. The predicted failure mode (i.e., 
leakage or complete fracture) shall be determined at the proof 
and maximum operating conditions. Analytical and experimental 
verification that the probable failure mode is leakage rather 
than complete fracture shall be required in those cases where 
assurance of operational life is not provided by the proof test. 

4.4.4.3 Allowable Stress Intensities

The performance of cyclic and sustained stress subcritical flaw
growth tests of the parent metal and weldments shall be a
requirement for all metallic pressure vessels designed for NASA. The
resulting data shall be used in conjunction with the maximum
expected service life requirements (i.e., cycles, time at
pressure, environment, etc.) to determine the allowable initial
stress intensity, K_Ii, and allowable stress intensity ratio,
K_Ii/K_Ic. Because of the major effect that test and service
environment can have on sustained stress flaw growth, every
effort shall be made to accurately simulate these environments
in the laboratory tests.

For thick walled vessels, the allowable initial stress intensity
shall be the largest value which cannot attain the critical value,
K_Ic, due to cyclic and/or sustained stress flaw growth within
the maximum required life span of the vessel. For both thick
and thin walled vessels which are subjected to prolonged
pressurizations, the allowable initial stress intensity shall
be less than the sustained stress threshold value, K_TH. For
vessels which normally experience only one short duration
operational cycle (e.g., solid propellant motor cases) the allowable
initial stress intensity will be allowed to exceed the threshold
value, provided that it has been shown from experimental stress
intensity versus time data that the initial stress intensity
cannot reach the critical value during the operational cycle.

The allowable K_Ii/K_Ic ratio to be used in determining the proof
test factor (App. D) shall be the lowest individual value obtained
from the analysis of the subcritical flaw growth tests of welds 
and parent metal in the various anticipated service environments. 

4.4.4.4 Allowable Flaws

Any flaws of such size, location, and orientation, which result
in an applied stress intensity equal to or less than the
allowable initial stress intensity at the operating stress levels,
are allowable initial flaws for the vessel as it is placed into
service. Using a proof test based on the minimum proof test
factor (allowable K_Ii/K_Ic), the allowable initial flaw sizes will
be equal to the critical sizes at proof stress level. To allow
for possible flaw growth during proof testing, and thus prevent
proof test failure, the allowable initial flaw sizes prior to
proof testing shall be somewhat less than the critical sizes at
the proof test level. The flaw growth allowance for slow growth
during proof testing is dependent upon the material, temperature,
and environment and shall be estimated from laboratory test data.
Nondestructive inspection acceptance limits shall be evaluated
based upon the calculated and experimentally determined
allowable flaw sizes. In general, those limits shall be conservative
enough to allow for both the uncertainties involved in the
determination of allowable flaw sizes and the probable tolerance on
the capability of the nondestructive inspection procedures.
4.4.4.5 Design Deviations

Since design deviations such as radial and angular mismatch of
welded joints result in increased stresses which in turn can
reduce the allowable flaw sizes, effort shall be made to
minimize these deviations. The allowable design deviations for
each vessel shall be established based on a study of the result- 
ing stresses, the effect of these stresses on allowable flaw 
size and nondestructive inspection capability. Joints contain- 
ing the established allowable radial and angular mismatch and 
containing the allowable surface flaw (on the high tension 
stressed surface) shall be able to withstand the proof pressure 
stresses without failure. 

4.4.4.6 Nondestructive Inspection

Pressure vessel weldments and parent metal shall be
nondestructively inspected per the applicable inspection
specifications called out in the NASA procurement specification for each
pressure vessel design. The adequacy of the specified acceptance
limits shall be verified based on the allowable flaw size
predictions. If the allowable flaw sizes (including the effect
of design deviations) are less than the specified acceptance
limits, the vessel design shall be modified so as to increase
the allowable flaw sizes. The specified acceptance limits
shall not be made more restrictive unless it has been clearly
demonstrated that the detection of smaller flaws is within the
capability of the inspection procedures.

4.4.4.7 Proof Test Procedures

4.4.4.7.1 Test Temperature

Every pressure vessel fabricated shall be proof tested to a
stress level equal to or greater than (1 ÷ allowable K_Ii/K_Ic)
x the maximum operating pressure at a temperature equal to or
less than the lowest expected operating temperature, except
as noted below.

Where it has been clearly demonstrated from laboratory tests 
that the pressure vessel weldments and parent metal have 
increasing plane strain fracture toughness values with decreasing 
temperature, the vessel shall be tested at a temperature equal
to the maximum expected operating temperature. 
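
The proof level requirement reduces to a one-line calculation. The
sketch below assumes the proof test factor is 1 divided by the
allowable K_Ii/K_Ic ratio, as stated in the requirement above; the
function names and the example numbers are hypothetical.

```python
def proof_test_factor(allowable_kii_over_kic):
    """Minimum proof test factor: 1 divided by the allowable K_Ii/K_Ic."""
    if not 0.0 < allowable_kii_over_kic <= 1.0:
        raise ValueError("allowable K_Ii/K_Ic must be in the range (0, 1]")
    return 1.0 / allowable_kii_over_kic


def minimum_proof_pressure(max_operating_pressure, allowable_kii_over_kic):
    """Minimum proof level = proof test factor x maximum operating pressure."""
    return proof_test_factor(allowable_kii_over_kic) * max_operating_pressure


if __name__ == "__main__":
    # Hypothetical vessel: 1000 psi maximum operating pressure and an
    # allowable K_Ii/K_Ic of 0.80 from subcritical flaw growth tests.
    print(minimum_proof_pressure(1000.0, 0.80), "psi minimum proof pressure")
```

A lower allowable ratio, meaning more expected flaw growth in service,
drives the required proof pressure up.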

4.4.4.7.2 Test Fluids

Stress intensity versus time data for the proposed test
fluid-pressure vessel material combination shall be obtained prior to
performing the proof test. If the threshold stress intensity
is low (lower than 0.70), then an alternate less aggressive test
fluid shall be used.
4.4.4.7.3 Pressurization and Hold Times

The time required to pressurize the vessel from K_TH/K_Ic x
the proof pressure to the proof pressure level shall be
the minimum possible as dictated by the capabilities of the
selected pressurization system, and the proof pressure shall be
maintained for the minimum time possible.

4.4.4.7.4 Depressurization Time

The vessel shall be depressurized from the proof pressure
level to K_TH/K_Ic x proof test level as fast as possible.
The exact time to depressurize to this pressure level will
depend on the flaw growth rates of the material.

4.4.4.7.5 Multiple Cycles

The general criterion is that proof testing shall be limited
to a single cycle except in the case where special
circumstances dictate the need or make it desirable to conduct more
than one proof test. Such special circumstances include the
following cases:

1) A single proof test cannot be designed to envelop the 
critical operational pressure, temperature and external 
loading combinations. 

2) The vessel has been modified or repaired subsequent to 
the initial test, and therefore requires recertification 
of proof test. 

3) It is desired to extend the guaranteed life of the vessel 
after it has had a period of service usage. 

4) From an economic standpoint it is desired to test
components (e.g., bulkheads) of the vessel prior to
initiating final assembly.

5) To minimize the risk of failure at the design temperature,
it has been shown (by laboratory experiments on preflawed
simulated parts or specimens) that a prior test at a
higher temperature is advantageous.

4.4.4.8 Combined Loads

For those pressure vessels which are critical for internal
pressure combined with flight loads, it may not be possible
to envelop the operational stress levels in the vessel with
internal pressure alone. In such cases the proof test setup
shall include provisions to apply simulated flight loads
combined with internal pressure. These loads shall be applied
during the test.
Post Proof Inspection

While it is possible that small amounts of flaw growth may 
occur during proof testing, the vessel should not fail in 
service providing the proof test was properly conceived and 
executed. Consequently, re-inspection of the vessel sub- 
sequent to proof testing is not generally considered to be 
necessary. 
4.5 FAILURE MODE, EFFECTS, AND CRITICALITY ANALYSIS (See Appendix E)

4.5.1 Summary Description of Technique 

FMECA considers each functional component of a system in each 
of its possible failed states, and deduces the effects of such
failures on man and the hardware. Data are collected about each 
component to predict the probability that an actual failure will 
occur. The failures which have the greatest detrimental effects 
and which are relatively likely to occur are listed in a safety 
critical parts list. In this way, attention is focused on the 
parts of the system which need correction. 

FMECA's are conducted in two steps: a Failure Mode and Effects
Analysis (FMEA), and a Criticality Analysis (CA). The FMECA
should be initiated at the same time that system functional
assemblies are being designed. As changes to the design are
proposed, these may be incorporated into the FMECA to determine
the net effect on system safety.

4.5.2 Application of FMECA

Failure Mode, Effects, and Criticality Analyses (FMECA) have 
been used for determining the reliability of systems, and may 
be used to determine system safety also. A different viewpoint 
is used, however, because the goal of reliability analysis is 
somewhat different than the goal of safety analysis. The objective 
of safety analysis is to determine hazards to life and equipment, 
and the failures that cause the hazards to become damaging. 

4.5.3 Input Data for FMECA

Conducting FMECA's requires that system requirements,
specifications and drawings be gathered early. If there are trade-off
studies completed, these should be reviewed for background in the
design compromises being considered. Evaluation of FMECA models
requires that large amounts of failure data be gathered and
assimilated. (See Figure 3-1)

4.5.4 Procedure for FMECA

The initial step of FMECA is the construction of a logic block
diagram showing the functional relationships of the elements of
the system under analysis. Next, each component is studied to
determine all possible modes of failure. Each failure mode for
each component is assumed to occur (the only failure in the
system at the instant being analyzed), and the possible effects
are traced through the system until the final effect is system
damage of a predetermined amount, the injury or death of
interfacing personnel, or no perceptible effect on safety. The
critical failure modes and components which affect safety are
then studied to determine their failure history. When this is
estimated, the probabilities that the safety reducing effects will
occur through each critical component failure mode are calculated.
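
The bookkeeping behind this procedure can be sketched as a small
record per failure mode. The field names, effect categories, and sample
entries below are hypothetical and serve only to show how failure modes
with a safety effect might be collected and ordered by estimated
probability; the handbook does not prescribe this layout.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    mode: str
    final_effect: str   # "damage", "injury", or "none"
    probability: float  # estimated probability that the mode occurs


def safety_critical_list(failure_modes):
    """Keep modes whose final effect bears on safety, most probable first."""
    critical = [fm for fm in failure_modes if fm.final_effect != "none"]
    return sorted(critical, key=lambda fm: fm.probability, reverse=True)


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    modes = [
        FailureMode("gas valve", "fails to close", "damage", 1e-3),
        FailureMode("relief valve", "fails to open", "injury", 1e-4),
        FailureMode("hot water faucet", "drips", "none", 1e-2),
    ]
    for fm in safety_critical_list(modes):
        print(fm.component, "-", fm.mode, "-", fm.final_effect, fm.probability)
```

In practice the ordering would weigh severity of effect as well as
probability; probability alone is used here to keep the sketch short.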
6.0 DEFINITIONS

Definitions of particular use to system safety engineers are included 
herein. Where possible, these definitions have been taken from: 

a. NASA Publication SP-7, "Dictionary of Technical Terminology for
Aerospace Use," 1st Edition, 1965.

b. NASA Publication SP-6001, "Apollo Terminology," August 1963. 

c. Air Force Publication, AFSCM 127-1 "System Safety Management." 

d. NASA Publication, NHB5300.1A, "Reliability and Quality Assurance 
Program Plan, Apollo" 

e. DOD Publication, MIL-S-38130A, "Safety Engineering of Systems and 
Associated Subsystems and Equipment, General Requirements for" 

ABORT - Premature termination of a mission because of existing or imminent 
degradation of mission success accompanied by the decision to make safe 
return of the crew the primary objective. 

ACCIDENT - An undesired event occurring by chance and which causes death, 
injury or damage to property. 

ASSEMBLY - A number of parts or subassemblies or any combination thereof 
joined together to perform a specific function. 

CHECKOUT (C/O) - A test or procedure for determining whether a person or
device is capable of performing a required operation or function. When 
used in connection with equipment, a checkout usually consists of the appli- 
cation of a series of operational and calibrational tests in a certain sequence, 
with the requirement that the response of the device to each of these tests
be within a predetermined tolerance. For personnel, the term checkout is 
sometimes used in the sense of a briefing or explanation to the person 
involved, rather than a test of that person's capability. 

COMPONENT - An article which is a self-contained element of a complete
operating unit and which performs a function necessary to the operation of
that unit.

COMPONENT AND PART RELIABILITY - A component or part is reliable when it will
operate to a predetermined level of probability under the maximum ratings at 
most severe combination of environments for which it was designed and for 
the length of time or number of cycles specified. 

COMPONENT STRESS - The stresses on component parts are those factors of usage
or test which tend to affect the failure rate of these parts. This includes
voltage, power, temperature, frequency, rise time, etc.; however, the principal
stress, other than electrical, is usually the thermal-environmental stress.

CREW - A group of ground and flight specialists who perform simultaneous and 
sequential duties and tasks involved in the accomplishment of an assigned
operation. 

CREW BAY - Any portion of flight hardware which will be environmentally 
controlled for crew habitation. 

CREW SAFETY - Safe return of crew members whether or not the mission is 
completed. 

CREW SAFETY PROBABILITY - The probability of flight crew return without 
exceeding prescribed emergency limits. 

CREW SAFETY SYSTEM (CSS) - Consists of the necessary sensors, test equipment, 
and displays, aboard the spacecraft to detect and diagnose malfunctions and 
to allow the crew to make a reasonable assessment of the contingency. For 
emergency conditions, the CSS is capable of initiating an abort automatically. 

CRITICAL DEFECT - A defect that judgment and experience indicate could result 
in hazardous or unsafe conditions for individuals using or maintaining the 
product or could result in failure in accomplishment of the ultimate objective. 

CRITICALITY - Assignment of relative importance to hardware or systems. 

CRITICALITY PARTS LIST - A listing of those parts whose failure would cause
a degradation in mission success or crew safety. 

DESTRUCT - The action of detonating or otherwise destroying a vehicle after 
it has been launched, but before it has completed its course. 

DETECTION DEVICES - Sensors used to sense and monitor conditions, e.g., 
open or closed valves, temperatures, flow rates, etc. The status of the 
condition is usually displayed on control consoles, such as Hazard Monitoring
Panels. 

ENVIRONMENT - The aggregate of all the conditions and influences which affect
the operation of equipment and components.

EQUIPMENT - One or more assemblies, or a combination of items, capable of
independently performing a complete function. 

EQUIPMENT FAILURE - When an equipment no longer meets the minimum acceptable
specified performance and cannot be restored through operator adjustment of 
controls. 

FAILURE - The inability of a system, subsystem, component, or part to perform 
its required function. 

FAILURE ANALYSIS - The study of a specific failure, which has occurred, in 
order to determine the circumstances that caused the failure and to arrive 
at a course of corrective action that will prevent its recurrence. 

FAILURE MECHANISM - The physical process which results in a part or equipment 
failure. 

FAILURE MODE - The physical description of the manner in which a failure 
occurs, the operating condition of the equipment at the time of the failure. 

FAILURE MODE, EFFECT AND CRITICALITY ANALYSIS 

FAILURE CRITICALITY ANALYSIS - Study of the potential failures that might
occur in any part of a space system in relation to other parts of the
system in order to determine the severity of effect of each failure in 
terms of a probable resultant safety hazard, and acceptable degradation 
of performance, or loss of mission of a space system. 

FAILURE EFFECT ANALYSIS - The study of the potential failures that might 
occur in any part of a space system in order to determine the probable 
effect of each on all other parts of the system, and on probable mission 
success. 

FAILURE MODE ANALYSIS - The study of a space system and working
interrelationships of the parts thereof under various anticipated conditions
of operation (normal and abnormal) in order to determine probable 
location and mechanism where failures will occur. 

FAILURE RATE - Rate at which failures occur as a function of time. If the
failure rate is constant, it is frequently expressed as the reciprocal of
mean-time-between-failure (MTBF).

FALL-BACK AREAS - Locations in vicinity of launch pad affording blast 
protection through use of wall, revetments and bunkers or sufficient 
distance. 

FAULT TREE ANALYSIS (LOGIC DIAGRAM ANALYSIS) - A logic oriented graphic
representation of the parallel and series combinations of independent
personnel or
equipment subsystem and component failure and normal operating modes that 
can result in a specified undesired event. This representation can be 
quantified to provide a relative measure of the paths leading to these events. 

FEASIBILITY STUDY - The phase during which studies are made of a proposed
item or technique to determine the degree to which it is practicable,
advisable, and adaptable for the intended purpose. 

FLIGHT - (1) The movement of an object through the atmosphere or through
space, sustained by aerodynamic, aerostatic, or reaction forces, or by
orbital speed; especially, the movement of a man-operated or man-controlled
device, such as a rocket, a space probe, a space vehicle, or aircraft.

(2) An instance of such a movement. 

FLIGHT CREW - The Apollo flight crew consists of three men who are cross- 
trained to be capable of manning any of the Command Module (CM) duty 
stations. The three crewmen are designated commander, navigator, and systems 
manager. The CM commander is also the Lunar Excursion Module (LEM) commander. 

FLIGHT MISSION - Within a project, the specific technical or scientific 
objective to be accomplished by a given launching of a space vehicle or 
launch vehicle. 

FLIGHT TERMINATION SYSTEMS - Devices or means for ending flight of space 
vehicle, e.g., propellant tank rupture, ordnance and explosive separation 
devices, etc. 

GROUND OPERATIONAL SUPPORT SYSTEM (GOSS) - The equipment, excluding the
launch vehicle, spacecraft, and launch complex, required to be in operation 
for direct support of the mission being accomplished. This equipment 
shall include that used to provide or support mission control, guidance and 
navigation, tracking, telemetry, communications, logistics, and recovery 
operations. 

GROUND SUPPORT EQUIPMENT (GSE) - That equipment on the ground, including all 
implements, tools, and devices (mobile or fixed) required to inspect, test, 
adjust, calibrate, appraise, gage, measure, repair, overhaul, assemble, 
disassemble, transport, safeguard, record, store, or otherwise function in
support of a rocket, space vehicle, or the like, either in the research and 
development phase or in an operational phase, or in support of the guidance 
system used with the missile, vehicle, or the like. 

The GSE is not considered to include land or buildings; nor does it include
the guidance-station equipment itself, but it does include the test and
checkout equipment required for operation of the guidance-station equipment.

HAZARD - A source of danger or risk. 

HAZARDOUS CONDITION - A situation involving risk of injury to personnel or
damage to property.

HAZARDOUS OPERATION - Specific operation involving risk.

HOLD-FIRE - An interruption in the countdown previous to ignition for lift-off.

INDUSTRIAL SAFETY - The safety of individual and independent manufacturing
procedures and industrial materials, equipment, and facilities. Industrial
Safety is also that organization which creates and administers safety
requirements pertinent to manufacturing or industrial operations, protective
equipment, and emergency procedures and equipment. The safety requirements
created by Industrial Safety result from: direct observation of industrial
activities, accident statistics, bio-medical studies, and equipment and material
specifications.

INTEGRATED SAFETY PROGRAM - A safety program for assembly, checkout, test,
and operation at the Launch Center. This program promotes exchange of
information and incorporates safety criteria in procedures and operations
that have been developed at other Centers and contractors.

INTERFACE - The junction points or the points within or between systems or 
subsystems where matching or accommodation must be properly achieved in 
order to make their operation compatible with the successful operation of 
all other functional entities in the space vehicle and its ground support. 

LAUNCH COMPLEX - That area which contains the space vehicle launching 
facilities, including the launch pad and servicing structures, the control 
buildings or blockhouse, propellant transfer equipment, support building, 
and all other facilities in the immediate vicinity required to support a 
space vehicle launch or lies within the prelaunch hazard area.

MAINTAINABILITY - The quality of the combined features of equipment design 
and installation that facilitates the accomplishment of inspection, test, 
checkout, servicing, repair, and overhaul with a minimum of time, skill, 
and resources in the planned maintenance environments.

MAINTENANCE - The function of retaining material in or restoring it to a 
serviceable condition. 

MISSION - The objective, task, or purpose which clearly indicates the action
to be taken. 

MISSION ANALYSIS - A comprehensive evaluation of all the parameters which 
affect the events of a mission. 

MISSION OPERATIONAL SAFETY - The essential safety qualities, considerations, 
and criteria necessary for a safe mission. 

MISSION PROFILE - A graphic or tabular presentation of the flight plan of a 
spacecraft showing all pertinent events scheduled to occur. 

MISSION SUCCESS - The attainment of all or a major part of the scientific 
objectives of the flight with no crew injury or loss of life. It has
sometimes been defined as a safe return of all three astronauts from a
completed lunar landing mission.

MISSION TASK - The specified purpose for which a device must perform. 

MODULE - (1) A self-contained unit of a launch vehicle or spacecraft which 
serves as a building block for the overall structure. The module is usually
designated by its primary function as command module, lunar landing module, 
etc. (2) A one-package assembly of functionally associated electronic parts, 
usually a plug-in unit, so arranged as to function as a system or subsystem;
a black box. (3) The size of some one part of a rocket
or other structure, as the semidiameter of a rocket's base, taken as a unit 
of measure for the proportional design and construction of component parts. 

OPERATING TIME - The time period between turn-on and turn-off of a system, 
subsystem, component or part during which time operation is as specified. 
Total operating time is the summation of all operating time periods. 

OPERATIONS SAFETY ANALYSIS (OSA) - An orderly examination of specified 
operations (or tasks) with the purpose of identifying significant hazards 
generated by that operation (i.e., people/machine interface). Each OSA 
includes those features or preventive measures necessary (Requirements) to
eliminate or preclude identified hazards. 

OUTGASSING - The release of gases (when pressure drops) that are entrapped
in materials.

PAD SAFETY - That portion of space vehicle safety concerned with vehicle 
operation in the area of the launch pad. This includes the exercising of
precautionary measures on fixed vehicle facilities, ground handling gear on
the pad, and the vehicle itself to the point of lift-off.

PART - (1) One of the constituents into which a thing may be divided.
Applicable to a major assembly, subassembly, or the smallest individual
piece in a given thing. (2) Restrictive. The least subdivision of a thing;
a piece that functions in interaction with other elements of a thing but
is itself not ordinarily subject to disassembly.

PUBLIC SAFETY - The protection of life and property of people in or close to, 
but not associated with the whole area of the range. 

QUALIFIED MATERIALS - Materials and articles that have been verified, by
tests and by examination of documents and processes, to be capable of
meeting performance requirements.

RANGE - Space which is utilized to conduct a launching operation. The Range
space for the in-flight phase of a space vehicle ceases at orbital injection,
will vary according to the requirements and characteristics of individual
space vehicles, and is specifically defined for each mission.

RANGE SAFETY - The process of minimizing hazards to persons or property
attendant to space vehicle operations and associated activities. Range
Safety includes Pad Safety and Flight Safety.

RANGE USER - An agency having overall management of a program requiring
the use of Test Range facilities in support of space vehicle operations.

NASA is a Range User.


REDUNDANCY - The existence of more than one means for accomplishing a given
task, where all means must fail before there is an overall failure to the system
(NPC 250-1). 

Parallel redundancy applies to systems where both means are working at the 
same time to accomplish the task and when either of the systems is capable
of handling the job itself, in case of failure of the other system. Standby
redundancy applies to a system where there is an alternative means of 
accomplishing the task that is switched in by a malfunction sensing device 
when the primary system fails. 


RELIABILITY - Of a piece of equipment or a system, the probability of 
specified performance for a given period of time when used in the specified 
manner. 


RELIABILITY ASSESSMENT - An analytical determination of numerical reliability 
of a system or portion thereof without actual demonstration testing. Such
assessments usually employ mathematical modeling, use of available test
results, and some use of estimated reliability figures. 

SAFETY - Freedom from those conditions which can cause injury or death to 
personnel, damage to or loss of equipment, or property. 


SAFETY CHECKLIST - A listing for verifying safety aspects of equipment,
procedures, and operations.


SAFETY DATA - Recorded knowledge for reference or application in safety and 
accident prevention field. This includes internal and external directive 
and procedural information, and safety criteria generated internally and 
externally such as reports, studies, summaries, panel, and committee minutes.

SAFETY SURVEILLANCE - Observation of designated hazardous or dangerous
operations by a safety representative to insure adherence to safety
principles, and compliance with operating plans and procedures, technical
data, safety directives and checklists.


SPACE SYSTEM - A system consisting of launch vehicle, spacecraft, ground 
support equipment, and test hardware used in launching, operating, and
maintaining the vehicle or craft in space. 

SPACE VEHICLE - A launch vehicle and its associated spacecraft. 

SUBSYSTEM - A major functional subassembly or grouping of items or equipment
which is essential to operational completeness of a system. 

SYSTEM - (1) Any organized arrangement in which each component part acts,
reacts, or interacts in accordance with an overall design inherent in the
arrangement. (2) Specifically, a major component of a given vehicle such
as a propulsion system or a guidance system. Usually
called a major system to distinguish it from the systems subordinate or 
auxiliary to it. 

The system of sense 1 may become organized by a process of evolution, as in 
the solar system, or by deliberate action imposed by the designer, as in a 
missile system or an electrical system. 

In sense 2, the system embraces all its own subsystems including checkout 
equipment, servicing equipment, and associated technicians and attendants. 

When the term is preceded by such designating nouns as propulsion or guidance, 
it clearly refers to a major component of the missile. Without the designating 
noun, the term may become ambiguous. When modified by the word major, however, 
it loses its ambiguity and refers to a major component of the missile. 

SYSTEM SAFETY - The optimum degree of safety within the constraints of
operational effectiveness, time, and cost attained through specific
application of system safety engineering throughout all phases of system
development and utilization.

SYSTEM SAFETY ENGINEERING - An element of systems management throughout the
program life cycle involving the application of scientific, engineering and
management principles for the timely identification of those actions
necessary to prevent or control hazards within the system.

TEST - (1) A procedure or action taken to determine under real or simulated
conditions the capabilities, limitations, characteristics, effectiveness, 
reliability or suitability of a material, device, system, or method. (2) A 
similar procedure or action taken to determine the reactions, limitations, 
abilities, or skills of a person, other animal, or organism. 

WARNING DEVICES - Sensors that monitor or detect conditions and provide 
visible and/or audible alerting signals as desired for selected events. 

ZERO-G CHARACTERISTICS - The reaction or change in behavior of a substance
or system introduced into an environment free of gravitational force.
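
Several of the definitions above (FAILURE RATE, RELIABILITY, REDUNDANCY) have simple numerical counterparts. The sketch below is illustrative only. It assumes a constant failure rate, so that the failure rate is the reciprocal of MTBF as stated under FAILURE RATE. It also assumes the standard exponential reliability model and independent failures, neither of which is stated in this handbook. The function names and the numbers are hypothetical.

```python
import math


def failure_rate_from_mtbf(mtbf_hours):
    """Constant failure rate as the reciprocal of mean-time-between-failure."""
    return 1.0 / mtbf_hours


def reliability(failure_rate, hours):
    """Probability of no failure over the period (exponential model, assumed)."""
    return math.exp(-failure_rate * hours)


def parallel_redundancy(*reliabilities):
    """Reliability when all means must fail before the system fails.

    Assumes the redundant means fail independently of one another.
    """
    prob_all_fail = 1.0
    for r in reliabilities:
        prob_all_fail *= (1.0 - r)
    return 1.0 - prob_all_fail


rate = failure_rate_from_mtbf(1000.0)       # hypothetical MTBF of 1000 hours
single = reliability(rate, 100.0)           # one unit over a 100-hour period
dual = parallel_redundancy(single, single)  # two such units in parallel
```

Under these assumptions the redundant pair is always at least as reliable as a single unit, since both units must fail before the task is lost.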
APPENDIX A

Gross Hazards Analysis 
Gross hazards analysis is a comprehensive, qualitative,
non-mathematical hazard assessment of a product or system.

The use of gross hazards analysis allows an early assessment 
of the inherent safety of the completed system. Early design 
changes, and early procedure changes which are made to eliminate 
or control hazards minimize costly modification after the system 
is built. The gross hazards analysis is accomplished in steps 
as follows: 

1) Identify all gross hazardous events, 

2) Prepare functional flows for fault event analysis, 

3) Evaluate functional flows for fault events or hazards, 

4) Make design change recommendations, 

5) Evaluate all procedures for hazards, 

6) Prepare safety procedures as necessary, 

7) Evaluate all proposed changes,

8) Make design change recommendations on changes, 

9) Make procedure change recommendations on changes. 
2.0 APPLICATIONS

The gross hazards analysis technique is applicable to complete 
systems or programs, or to major segments of a system or program, 
where it is necessary to identify safety critical areas, identify 
the hazards involved, establish the controlling design criteria 
that will be used and provide recommendations for hazard elimina- 
tion or further hazard analysis. The gross hazards analysis allows 
program management to define the system safety task for the life of 
the program and plan for manning and budgeting as well as to estab- 
lish goals and priorities. 
3.0 INPUT DATA REQUIREMENTS

Data useful for gross hazard analysis studies would include
the following: 

1) Requirement specifications 

2) System specifications 

3) Detail specifications 

4) Flow diagrams 

5) Schematic diagrams 

6) Installation drawings 

7) Detail drawings 

8) Operations and maintenance manuals 

9) Technical operating procedures 

10) Test and checkout procedures 

11) Test requirements 

12) Standards 

13) Waivers and deviations 

14) Safety codes, procedures and regulations 

15) Failure reports 

16) Critical parts lists 

17) Analyses of similar systems 
4.0 PROCEDURE FOR GROSS HAZARDS ANALYSIS

1) Operations:

A. Identify all gross hazardous events. Known safety critical 
areas are identified first using existing design guidelines 
such as: 

1. Company Standards 

2. State Codes and Regulations 

3. Advisory Codes 

4. Range Safety Guidelines. 

Considerations in this hazardous events identification would 
include but not be limited to: 

1. Propellants (fuel, oxidizer, mono, solid) 

(a) Characteristics 

(b) Hazards - (Personnel, system) 

(c) Handling Requirements 

(d) Storage Requirements 

(e) Transportation Requirements. 

2. Explosives 

(a) Hazard Classifications 

(b) Characteristics 

(c) Handling Requirements 

(d) Storage Requirements 

(e) Transportation Requirements. 

3. Pressure Piping and Vessels 

4. Other energy sources in the system. 

5. Environmental constraints 

(a) Radio Frequency Fields 

(b) Temperature requirements 

(c) Pressure requirements 

(d) Vibration requirements 

(e) Crash worthiness requirements 

(f) Rescue, Egress and salvage requirements. 

6. Operator and Maintainor Human Factors and 
Training Requirements. 

7. Material compatibility 

8. Maintainability. 

9. Emergency capabilities 

Other areas where hazardous conditions are less immediately 
obvious will require separate analysis and investigation to 
identify all critical areas. 

B. Prepare functional flows for fault event analysis. Major flows 
might be as follows in a manned flight system. Each major event, 
system, operation or facility should be identified in the flow. 

1. Mission events critical to crew/equipment safety 

2. Critical systems 

3. Critical operations (manufacturing) 

4. Critical operations (test) 

5. Critical facilities. 

C. Evaluate functional flow diagrams for fault events and hazards. 

1. Mission events critical to crew/equipment safety. 
Events such as the following should be examined to 
identify potential hazardous conditions. 

(a) Ground to vehicle power transfer 

(b) Stages firing and separation 

(c) Launch escape sequence 

(d) Ground control and communication 

(e) In-flight operations and tests 

(f) Re-entry 

(g) Recovery. 

2. Critical Systems 

Systems such as the following should be examined to 
identify potential hazards. 

(a) Explosives 

(b) Propellants 

(c) Power sources 

(d) Pressure systems 

(e) Life-support 

(f) Propulsion 

3. Critical Operations (Manufacturing) 

Operations, such as the following, should be examined 
to identify potential hazards. 

(a) Toxic or reactive materials 

(b) Welding 

(c) Cleaning 

(d) Handling 

(e) Fabricating, Forming, Machining 

(f) Assembly. 
4. Critical Operations (Test)

Operations, such as the following, should be
examined to identify potential hazards.

(a) Qualification and Proof Tests

(b) System Functional Tests

(c) Explosive Tests

(d) Transport and Handling

(e) Static Tests

5. Critical Facilities

Facilities, such as the following, should be examined
to identify potential hazards.

(a) Pneumatic

(b) Propellant

(c) Assembly

(d) Ordnance

(e) Special Test

(f) Environmental

(g) Launch

(h) Manned Item Support.

D. Make design change recommendations.

For each fault or potential hazard, a suitable permanent
solution should be proposed for review by design authorities.
In some instances a temporary work-around proposal may be
necessary to allow further study of a permanent fix.

E. Evaluate all Procedures for Hazards.

1. Installation

2. Operations

3. Maintenance

4. Test

5. Emergency.

F. Prepare Safety Procedures as Necessary.

1. Explosives Control Procedure

2. Confined Spaces Entry Procedure

3. Radioactive Material Control Procedure

4. Toxic Propellant Control Procedure

5. Toxic Materials Control Procedure

6. Radiographic Operations Procedure

7. Flammable Liquids Control Procedure

8. Pressure Systems Control Procedure

9. Material Disposal Procedure

10. Emergencies - Medical - Fire - Explosion - Other

11. Other special area procedures.

NUMBER D2-119062-1

G. Evaluate All Proposed Changes 

As the system is modified, redesigned, or updated, the gross
hazard analysis of each change should be performed well
in advance of change implementation.

H. Make Design Change Recommendations On Proposed Changes. 

I. Make Procedure Change Recommendations On Proposed Changes. 

2) Documentation of Analysis 

Documentation of a gross hazard analysis can take several 
forms. It should be a working document and may include: 

(a) A list of safety critical systems 

(b) Explosive components list

(c) Radioactive components list 

(d) Corrective action list 

(e) Work-around action list. 

A worksheet useful in summarizing the hazardous condition 
or conditions, the hazard category designation, and 
recommendations for action to be taken, including further 
analysis, for each safety critical item may be patterned 
after the sample worksheet shown in Figure A1. 
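
One way to keep such a worksheet as a working document is to hold each safety critical item as a structured record. The sketch below is only an illustration of that idea. The field names follow the worksheet description above, while the category labels, items, and recommendations are hypothetical and are not taken from Figure A1.

```python
from dataclasses import dataclass, field


@dataclass
class WorksheetEntry:
    item: str                       # safety critical item
    hazardous_conditions: list      # one or more hazardous conditions
    hazard_category: str            # category designation (hypothetical labels)
    recommendations: list = field(default_factory=list)
    further_analysis: bool = False  # True if more detailed analysis is recommended


def needing_further_analysis(entries):
    """List the items flagged for a more detailed follow-on analysis."""
    return [e.item for e in entries if e.further_analysis]


def summary_line(entry):
    """One-line summary of an entry for a working document."""
    conditions = "; ".join(entry.hazardous_conditions)
    return f"{entry.item} [{entry.hazard_category}]: {conditions}"


# Hypothetical entries for illustration only.
entries = [
    WorksheetEntry("Propellant transfer line", ["Leak of toxic propellant"],
                   "Category A", ["Add leak detection"], further_analysis=True),
    WorksheetEntry("Access platform", ["Fall hazard during servicing"],
                   "Category B", ["Install guard rails"]),
]

flagged = needing_further_analysis(entries)
```

The `flagged` list gives the items to be carried forward to a more detailed qualitative or quantitative technique.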

CONCLUSIONS 

Gross hazards analysis is generally considered to be a rapid 
analysis method which will identify areas of concern from a 
gross standpoint which may then be further analyzed by a more 
detailed qualitative and/or quantitative technique. 
Appendix B
Part I

Operations and Test Safety Analysis (OSA-1)

1.0 INTRODUCTION 

The Operations and Test Safety Analysis (OSA-1) method identifies
operations that are inherently hazardous or which, by the nature
of the function sequences, can lead to development of hazards in
the operation of a system. This method can be used in all aspects
of system operation from construction to mission termination. 

The objective of performing OSA’s is to ensure that hazards,
existing or developing during a particular task, are identified,
documented and brought to the attention of the proper authorities
for resolution. Such hazards may result from the task itself, or
from interaction of other work being done concurrently with the
task. The OSA’s will include corrective action recommendations
which serve to eliminate these hazards, or reduce them to an
acceptable level. Each task is reviewed and the reasoning for a
particular safety requirement is recorded to substantiate program
decisions.

Each task (act, process, or test) shall be analyzed individually
to ensure complete investigation of all situations requiring safe- 
guards, special equipment, or specific instructions (e.g., cautions, 
warnings, or verifications) to avoid personnel injury or signif- 
icant equipment damage. Previous analyses of hazards in specific 
areas of operation should be used to the maximum extent. The 
following method provides a means of accomplishing a comprehensive 
analysis of each task. 

The results of OSA’s, specifically safety requirements for each
task, can either be used as direct input to the detailed
procedures for the task or provide a baseline for criteria,
standards, manuals, or handbooks against which the detailed
procedure is written.

Data useful for Operations and Test Safety Analysis would 
include the following: 

1) Test and Checkout Plan and Test Requirements

2) Test and Checkout Procedure*

3) End-to-End Schematics of Test Equipment and Item 
Being Tested** 

4) Installation Drawings of Test Equipment 


*NOTE 1:

A useful method of organizing this data is to establish
a matrix of the equipment components that must be operated
and monitored versus the test steps. Each step has requirements
as to the configuration of the hydraulic valves,
electrical switches or mechanical positions. The safety
engineer can then analyze the hazards involved should any
element not be in the required mode. See Figure B1.


**NOTE 2:

Caution should be observed to ensure that schematics reflect
all details of the as-built equipment.



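FIGURE B1 - TEST REQUIREMENTS DATA ORGANIZATION

(Matrix of components versus numbered test steps, giving the required
test state of each component at each step. Components shown: Valve AAV #1,
Power on Buss #1, Latch #3. Example states shown: Closed, Open, On,
Latched, etc.)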
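
The matrix described in NOTE 1 can be held as a simple lookup of required state by component and test step. The following is a minimal sketch under stated assumptions. The component names are those shown in the figure, but the assignment of states to particular steps is hypothetical, as is the function name.

```python
# Required state of each component at each test step.
# The step-by-step values are hypothetical, for illustration only.
REQUIRED_STATE = {
    "Valve AAV #1":     {1: "Closed", 2: "Open"},
    "Power on Buss #1": {1: "On", 2: "On"},
    "Latch #3":         {1: "Latched", 2: "Latched"},
}


def deviations(step, actual_state):
    """Return the components not in the required mode at a given step.

    Each result is a tuple of (component, required state, actual state).
    """
    found = []
    for component, by_step in REQUIRED_STATE.items():
        required = by_step.get(step)
        if required is None:
            continue  # no requirement recorded for this step
        actual = actual_state.get(component)
        if actual != required:
            found.append((component, required, actual))
    return found


observed = {
    "Valve AAV #1": "Closed",
    "Power on Buss #1": "On",
    "Latch #3": "Latched",
}
step1 = deviations(1, observed)  # configuration matches the step 1 requirements
step2 = deviations(2, observed)  # valve has not been opened for step 2
```

Each deviation found in this way marks an element that is not in the required mode, which is the situation the safety engineer is asked to analyze for hazards.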
REFERENCE DOCUMENTATION

1) Apollo Program Directive APD No. 33, "Center Responsibilities in the
Apollo Program."

2) Apollo Program Directive APD No. 31, "Apollo System Safety Program
Requirements."

3) Apollo Program Directive APD No. 26B, "Preparation of Test and
Checkout Plans and Procedures at KSC."

4) Document No. D2-117019-1, March 1968, The Boeing Co., Contract
NASW 1650, "Guidelines for Operations and Test Safety Analysis."

5) BSD Exhibit 66-22, March 1, 1967, "Safety Engineering Analysis for
Field Activities, WS-133."
3.0 ANALYSIS METHOD

3.1 WORK SHEET


The actual analysis may be prepared on a work sheet as shown 
in Figure B2. It can be prepared in longhand by the analyst
and retained for reference. The work sheet should include 
the following: 

3.1.1 Task Column

This column is used to itemize the tasks required to complete 
the operation or test being analyzed. It should evolve from 
an examination of every act, function, and associated equipment 
that is a part of the operation. If new procedures are added 
by the safety requirements they will also be entered in this 
column, then analyzed for existing or potential hazards. 

In dividing the operation into distinct tasks, the separation 
must be sufficiently explicit to ensure complete visibility of 
possible hazards. The task description should include, where 
appropriate, a brief statement of the function or effect of the 
operation within the system. Each task will be identified by 
numbers as shown in Figure B2.

3.1.2 Hazard Column 

The Hazards Column contains a description of the hazardous con- 
ditions that are revealed by examination of the procedures. It 
also includes hazards known to exist, although they may already 
have been resolved. To aid in the search for hazards, identify 
energy sources and energy transmissions. Use appropriate sequence 
numbering to correlate the hazards with the correct steps of the 
procedures (Figure B2). Appropriately indicate those procedural
steps in which no hazard can be found. Explain hazards as fully 
as possible. The questions what, where, when, how, and why will
be answered as applicable. The analyst should consider possible
human errors during normal operations and maintenance. Emergency 
situations should be considered to ensure that such conditions 
can be mitigated. 

3.1.3 Safety Requirements Column

List requirements in procedures, processes, material, or equipment 
necessary to reduce, or eliminate, the identified hazard(s). If 
additional tasks are generated by these requirements (Safety 
Requirements), they can be added to the Task Column. Each of the
new tasks must be examined to determine if it creates new hazards
and subsequent safety requirements. A mandatory sequence of tasks
resulting from the analysis can be described in this column.

If sequencing becomes too complex or confusing, a safety sequence
chart should be developed to show the prescribed sequence of operations
from a safety standpoint. See Figures B3 and B4 for symbols and a
sample "Mandatory Safety Sequence Chart", respectively.

SHEET BI-301 








3.1.4 Justification Column 

Pertinent information such as data calculations, standards, 
ideas, and concepts leading to the identity of a hazard, and 
the subsequent development of safety requirements are listed 
in the Justification Column. 

Information sources used to determine that a hazard exists 
and to develop safety requirements must be recorded. This 
column can list background and reference data such as 
material specifications, compatibility factors, and logic 
methods used in arriving at a particular conclusion. 
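
The four work sheet columns described above can be pictured as one record per task. The following is a minimal sketch only: the class name `WorksheetEntry`, its field names, the `is_resolved` rule, and the sample rows are all illustrative assumptions and are not taken from Figure B2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorksheetEntry:
    """One row of an OSA work sheet (names are illustrative assumptions)."""
    task_number: str                  # sequence number tying hazards to tasks
    task: str                         # Task Column
    hazards: List[str] = field(default_factory=list)              # Hazard Column
    safety_requirements: List[str] = field(default_factory=list)  # Safety Requirements Column
    justification: List[str] = field(default_factory=list)        # Justification Column

    def is_resolved(self) -> bool:
        """Assumed rule: a row is resolved when it lists no hazard, or when
        at least one safety requirement is recorded against its hazards."""
        return not self.hazards or bool(self.safety_requirements)


# Hypothetical sample rows, for illustration only.
rows = [
    WorksheetEntry("1", "Connect ground strap"),
    WorksheetEntry("2", "Apply power to test set",
                   hazards=["Electrical shock at exposed terminals"],
                   safety_requirements=["Install terminal covers before power-on"],
                   justification=["Applicable electrical safety standard"]),
    WorksheetEntry("3", "Pressurize line", hazards=["Line rupture"]),
]

# Tasks whose hazards still lack a safety requirement.
unresolved = [r.task_number for r in rows if not r.is_resolved()]
```

Keeping the justification in the same record makes it easy to drop that column when the typed analysis report is produced.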

3.2 HAZARD DETERMINATION

Tasks from procedures requirements will be reflected in the 
Task Column of the OSA. Each of the detailed tasks will be 
examined to determine functional and nonfunctional relation- 
ships with associated equipment, test components, operators, 
maintenance personnel, and the system as a whole. Based on 
the elements of each task, any action producing an event or 
effect that would be detrimental to the system will be identified. 
This could be developed in general terms of energy control. The
analyst will look for such things as the uncontrolled release, or
misuse, of mechanical, electrical, electromagnetic, and chemical
energies. Springs, levers, pulleys, power supplies, radar antennas,
propellants, and acids are typical of the many sources of injury to
personnel, or damage to equipment (see Section 4, Page BI-401).

Specific safety requirements will be established to illustrate
the need for removing, or effectively reducing, the effects,
or potential effects, of uncontrolled energies.

3.3 SAFETY SEQUENCE CHARTS

Development of a Safety Sequence Chart allows easy communication 
of safety requirements to the operations planning groups. The 
Sequence Chart further provides a baseline analysis which can be 
efficiently modified when task objectives are changed, or when 
identification of new hazards indicates that new operational 
requirements are desirable. 

The safety requirements shown on the Sequence Chart can be 
indicated on the analysis report sheets in the "Requirements" 
column and cross referenced for identification on the chart. 

Description of the tasks to be accomplished can be found in the 
test requirements documentation and in the test and checkout 
plan. If the analysis is conducted late in the operations planning
phase, draft test and checkout procedures can provide more information
about the equipment involved, and will reflect those safety
requirements already established.


SHEET BI-302





Figure B2 









3.3 (Continued) 

The Safety Sequence Charts can be developed after all of the
tasks are defined, and the required sequence or parallel accomplishment
is based on a knowledge of the hazards in the equipment used.

3.3.1 SYMBOLOGY FOR SAFETY SEQUENCE CHARTS 

EXAMPLE NUMBER 1 


Operations that may be performed in 
any sequence, but not concurrently: 

EXAMPLE NUMBER 2 


Operations which may be performed 
concurrently, or consecutively: 


EXAMPLE NUMBER 3 


Operations which must be 
performed concurrently: 


EXAMPLE NUMBER 4 

Operations which must be
performed in a mandatory
sequence (all operations
prior to an arrow must be
accomplished before proceeding
to the next operation):

EXAMPLE NUMBER 5 


Example 5 is a combination 
of examples 2 and 4: 


• Block 1 must be accomplished before Block 2. 

• Block 3 must be accomplished before Block 4. 

• Blocks 1 and 3 may be accomplished concurrently or in any 
sequence. 

• Blocks 2 and 4 may be accomplished concurrently or in any 
sequence. 

• Block 4 may be accomplished before Block 1.

• Block 2 may be accomplished before Block 3. 
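
The rules listed for Example 5 can be checked mechanically. The sketch below is illustrative only: it considers consecutive orderings of the four blocks (concurrent performance is not modeled), and the names `MUST_PRECEDE` and `is_allowed` are assumptions introduced here.

```python
from itertools import permutations

# Mandatory sequences from Example 5: (earlier block, later block).
MUST_PRECEDE = [(1, 2), (3, 4)]


def is_allowed(order, must_precede=MUST_PRECEDE):
    """Return True if every mandatory predecessor appears before its successor."""
    position = {block: i for i, block in enumerate(order)}
    return all(position[a] < position[b] for a, b in must_precede)


# Every consecutive ordering of Blocks 1 through 4 that respects both arrows.
allowed_orders = [p for p in permutations([1, 2, 3, 4]) if is_allowed(p)]
```

Orderings such as 3, 4, 1, 2 pass the check, in agreement with the statement that Block 4 may be accomplished before Block 1, while any ordering that places Block 2 ahead of Block 1 is rejected.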


Figure B3A (block diagrams for Examples 1 through 5; blocks labeled Step A, Step B, and Step C)
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3.3.1 (Continued) 

EXAMPLE NUMBER 6 

Tasks which have no safety sequencing
requirements may be shown as dashed lines:

[dashed block: Step X]


EXAMPLE NUMBER 7 


If there are alternate tasks that may be performed to accomplish 
the same functions, each may need different safety requirements. 
This may be represented symbolically by: 



NOTE: Sequencing requirements must be shown but all possible
acceptable sequencing need not be noted.

Figure B3B 


3.3.2 ANALYSIS REPORTING 

The analysis report may be typed on a form similar to the work 
sheet excluding the justification column. It should include, 
however, a correlation column comprised of a notation of where 
the safety requirement was documented. 

Each safety requirement resulting from the analysis should be
provided to the responsible organization before the test so that
it can be properly entered in the appropriate document. Inclusion
should be identified in the Correlation Column as step XX of
XX-XXXX. If a particular safety requirement is rejected, the
Correlation Column should state the reason for its rejection, and
the rejection should be forwarded to the center safety office.


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3.4 EXAMPLE OF METHOD (KSC) 

The Test and Checkout Requirements document provides the test 
title and a very brief description of each test. It includes 
equipment effectivity and pertinent notes advising of certain 
cautions that must be observed. 

The Test and Checkout Plan contains an integrated test sequence flow
chart showing the overlap, if any, that will occur between the
various tests. In the example (Figures B5 and B6), the Space
Vehicle Cutoff and Malfunction Test for AS-503 does not overlap
with any preceding or subsequent tests. The T&CO Plan lists each
of the tests that will be conducted under this plan by test number
(V-20021), stage contractor responsibility code (contractor name), test
title (Space Vehicle Cutoff and Malfunction Test), and by the test
catalog sheet revision (Rev. A).

The Task Column of the OSA sheet will be filled in from the Test
and Checkout Plan sheet(s), functional flows, drawings, and
specifications. Each act, procedure, or task will be analyzed to
determine the possibility of personnel injury or property damage.

Each hazard will be described in detail. The safety requirements
will tell which action must be taken to prevent the occurrence of
the listed hazard. This column will include specific note, caution,
and warning citations deemed necessary for direct input to detail
procedures.

NOTE: A pictorial diagram(s), if available, will be included as
applicable in each analysis to define the location(s) of the
operation or task being analyzed.

The final analysis sheets (Figure B6) will be formally documented. 

An Operations and Test Safety Analysis (OSA) will contain: 

1) Title Page

Includes analysis number, operation title and signature for 
preparation and approval; 

2) Active Record Sheet 

Includes a list of every page in the document with proper 
identification of added, revised, and deleted pages; 

3) Revision Sheet 

Will be blank on initial release. Includes a record of added,
revised, and deleted pages with a notation telling why the change
was made. Each revision will require the initials of the approving
individual.
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3.4 (Continued) 

4) Table of Contents

Includes contents of document plus a list of tables, figures,
and charts. All tables, figures, and charts will be assigned
a figure number beginning with "1" and following consecutively
through the document. Figures added with subsequent revisions
will be numbered with a .1, .2, and so on following the preceding
figure number (e.g., 3.1, 3.2, 3.3).

Analysis will include: 

Introduction (Figure B5)

Scope

Summary of Analysis 

Ref: Test and Checkout Plan Sheet(s) (Figure B&) 

Test Sequence Flow Plan 
Source Material 

Operations Sequence Requirements 

Equipment (or operation) Location Charts (Figure B5)
Analysis Sheets (Figure B6) 

A document numbering system will be established at each MSF Center.
If numbering systems exist, they will be used as applicable.
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. SAFETY ANALYSIS OF SPACE VEHICLE CUTOFF AND MALFUNCTION TEST - APOLLO/SATURN 

1.0 INTRODUCTION

1.1 SCOPE

This document contains the technical safety analysis of test
No. V-20021, Space Vehicle Cutoff and Malfunction Test, developed
by (name of organization performing analysis) on (date).

1.2 ANALYSIS SUMMARY

This summary shows the most important safety requirements developed
in this analysis. They must be implemented before the test.
(Describe the effects on the test if requirements are not met.
If none, so state.)

2.0 REFERENCES

2.1 TEST AND CHECKOUT PLAN

2.2 TEST SEQUENCE FLOW PLAN

2.3 EQUIPMENT LOCATION CHARTS

2.4 SOURCE MATERIAL

3.0 OPERATIONS SEQUENCE REQUIREMENTS

These are the sequence requirements which result from
the safety analysis.

4.0 ANALYSIS SHEETS

Example - Operations Analysis Format
FIGURE B5
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Figure B6A 
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SAFETY ANALYSIS GUIDE 
GENERAL 

The following guide, containing hazards to be considered during
the analysis of a task, is only a partial listing and represents
the type of areas to be questioned. It is not practical to
attempt a comprehensive list of all possible conditions or hazards
attendant to a given test before completing the analysis. The
prime factor in accomplishing an operation and test safety analysis
is knowledge of the equipment involved and its relationship to the
surrounding equipment or system.

REPRESENTATIVE CONSIDERATIONS FOR OSA 

1) Consider special safety barrier requirements for modification
work;

2) Determine grounding or disconnection requirements for work on 
electrical/electronic equipment; 

3) Determine that operation in one area, or on one item of equip- 
ment, will not create or induce a hazard in another area, or 
on associated items of equipment; 

4) Consider special or additional lighting requirements for 
modification work; 

5) Consider need for special personnel protective clothing and
equipment (e.g., safety harnesses, breathing apparatus, or
goggles);

6) Consider all hazards associated with welding operations (e.g.,
transient currents, electrical interference, fire, and air
contamination);

7) Consider the need for special ventilation requirements for
personnel working in closed areas, oxygen-deficient conditions,
or contaminated air (e.g., inside tanks, or performing
painting, welding, or cleaning operations);

8) Consider dangers associated with personnel working in proximity
to high voltage;

9) Consider the need for backup power when working on primary 
power source; 

10) When drilling or chipping concrete, investigate the possibility
of contacting or damaging embedded pipe or conduit;
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4*2 (Continued) 

11) Determine the probability of any task restricting egress
from the work area by blocking passageways or doors with
equipment;

12) Investigate hazards associated with installation and removal
of explosive ordnance devices and electrical connection to,
or disconnection from, ordnance devices;

13) Consider the need for special retest instructions; 

14) Consider the need for special entry/exit procedures; 

15) Ensure that provisions have been made to communicate with
personnel in isolated areas;

16) Review requirements for warning placards;

17) Consider safety precautions to be observed by personnel working 
on or around exposed electrical equipment; 

18) Consider the hazards involved when personnel are working 
around caustic, poisonous, or cryogenic materials; 

19) Establish special precautions for connecting or disconnecting
cables;

20) Consider electrical interference hazards stemming from use
of electrically powered tools;

21) Consider the effects of status monitoring or communications
interruptions;

22) Determine if special procedures are required to prevent
induced faults when working on primary power equipment and
switchgear;

23) Consider requirements for equipment isolation when working 
on electrical or electronic power equipment. 
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Appendix B 
Part II 

OPERATIONS SAFETY RESEARCH 


1.0 LINEAR PROGRAMMING 

1.1 INTRODUCTION 

Linear Programming has had a wide variety of uses, but a common 
characteristic for all has been the optimum allocation of limited 
resources to accomplish a defined objective. The optimal com- 
bination of operations minimizes cost, period of performance, 
system output errors, number of operations required, number of 
operators required, and is least likely to cause system damage 
or personnel injury. The resources used to operate a system can 
be allocated so as to optimize system safety. 

1.2 DESCRIPTION OF THE LINEAR PROGRAMMING METHOD

Linear Programming uses a mathematical model which describes a
characteristic of a system. For system safety engineers, this
characteristic is operational safety. Use of this method
requires that all mathematical functions in the model either be
linear or be closely approximated by linear functions. Use of the
model allows the programming, or planning, of activities to obtain
the optimum level of safety.

Linear programming is generally divided into six steps: 

1. Define the measure of effectiveness, 

2. Construct the model, 

3. Evaluate the model for optimal results,

4. Test the model and its solution,

5. Define the controls to ensure optimum results, and 

6. Assure that controls are implemented. 


1.2.1 Measure of Effectiveness

The operational safety problem may be stated in two ways:
(a) The degree of safety may be chosen as the measure of
effectiveness, in which case the solution of the math model should
be maximized. (b) If risk is chosen as the measure of effectiveness,
the solution of the linear model must be minimized. Note: For the
discussion that follows, risk will be assumed as the measure of
effectiveness.*

1.2.2 Construction of the Linear Model

It is necessary to find the values of the variables $x_1, x_2, x_3, \ldots, x_n$
which minimize the function of risk

$$R = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$$

where $x_i$ could be the hazard associated with each resource consumed,
and $c_i$ is the increase in $R$ for each unit of $x_i$.

Constraints on the variables take the form of inequalities

$$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \le b_1$$
$$a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \le b_2$$
$$\vdots$$
$$a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \le b_m$$

$$x_1 \ge 0,\; x_2 \ge 0,\; \ldots,\; x_n \ge 0.$$


The limits $b_1, b_2, \ldots, b_m$ can be the total available resources
for the achievement of the task objective. This could be total
manpower, pounds of propellant, electric power generation capability,
etc. The coefficients $a_{11}, a_{12}, \ldots, a_{mn}$ are the units
of each resource consumed by each unit of hazard. For example,
$a_{ji}$ could be the BTU's per pound of propellant, TNT explosive
energy equivalency per pound of propellant, or amperes available
at man-machine interfaces per watts of power available at
the test equipment. The specific units of $a_{ji}$ depend on the
hazard, $x_i$, and the resource $b_j$.
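
As a rough illustration of the model just described, the sketch below evaluates the risk function and tests a candidate set of values against the resource limits. It assumes the constraints are upper limits on available resources. The function names `risk` and `is_feasible` and every number shown are hypothetical and are not taken from this handbook.

```python
def risk(c, x):
    """Objective R = c_1*x_1 + c_2*x_2 + ... + c_n*x_n."""
    return sum(ci * xi for ci, xi in zip(c, x))


def is_feasible(a, b, x):
    """Check x_i >= 0 and, for each resource j, sum_i a[j][i]*x[i] <= b[j]."""
    if any(xi < 0 for xi in x):
        return False
    return all(sum(aji * xi for aji, xi in zip(row, x)) <= bj
               for row, bj in zip(a, b))


# Hypothetical numbers, for illustration only.
c = [0.02, 0.05]      # increase in risk per unit of hazards 1 and 2
a = [[1.0, 2.0],      # units of resource 1 consumed per unit of each hazard
     [3.0, 1.0]]      # units of resource 2 consumed per unit of each hazard
b = [10.0, 15.0]      # total available amount of each resource

plan = [2.0, 3.0]                   # candidate values of x_1 and x_2
plan_risk = risk(c, plan)           # 0.02*2 + 0.05*3
plan_ok = is_feasible(a, b, plan)   # both resource limits are respected
```

A linear programming solution searches over all feasible plans for the one with the smallest value of the risk function, rather than testing plans one at a time as done here.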


*Each time the system is operated, there are two possible 
outcomes. One is that the tasks are performed without any 
equipment damage or personnel injury. The other outcome 
may be that some injury or damage occurs. The probability 
of safe performance (i.e., no damage, etc.) is P(S), and
the probability of an accident is P(A). P(A) is the risk,
and P(S) = 1 - P(A).
1.2.3 Evaluating the Model

The most common method of solving linear programming problems is
the Simplex Method. To illustrate this method, assume the linear
model,

$$Z = 3x_1 + 5x_2$$

with constraints,

$$x_1 \le 4$$
$$x_2 \le 6$$
$$3x_1 + 2x_2 \le 18$$
$$x_1 \ge 0,\; x_2 \ge 0.$$

The possible values of $(x_1, x_2)$ coordinates are shown below.

The shaded area represents all possible combinations of $x_1$ and $x_2$
which satisfy the inequalities $x_1 \le 4$ and $x_2 \le 6$.
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1.2.3 (Continued) 


Adding the constraint $3x_1 + 2x_2 \le 18$ yields
the shaded domain shown below.

Figure B-8 - Maximum Value

The maximum value for the objective function,

$$Z = 3x_1 + 5x_2$$

exists in this domain, and could be found by trying some values
for Z. If Z is 20, the line $20 = 3x_1 + 5x_2$ lies well inside the
domain, and there are many pairs $(x_1, x_2)$ which satisfy the
constraints and the objective function. Z must be higher in value.
The optimum value will have only one pair $(x_1, x_2)$ which will
solve the linear function.
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1.2.3 (Continued) 



The value of Z which is the optimum is $36 = 3x_1 + 5x_2$, and
$x_1 = 2$, $x_2 = 6$ are the desired values for the input variables
which will produce the optimum.
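
The graphic result can be cross-checked numerically. The sketch below is not the Simplex Method; it is a simple corner-point search, assuming only that the optimum of a linear objective over this bounded domain lies at a corner where two constraint boundaries meet. The names `CONSTRAINTS`, `objective`, and `corner_points` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
CONSTRAINTS = [
    (1, 0, 4),     # x1 <= 4
    (0, 1, 6),     # x2 <= 6
    (3, 2, 18),    # 3*x1 + 2*x2 <= 18
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]


def objective(x1, x2):
    """Z = 3*x1 + 5*x2."""
    return 3 * x1 + 5 * x2


def corner_points(constraints):
    """Intersect every pair of boundary lines and keep the feasible points."""
    points = set()
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel boundaries never intersect
        x1 = Fraction(b * c2 - a2 * d, det)   # Cramer's rule
        x2 = Fraction(a1 * d - b * c1, det)
        if all(p * x1 + q * x2 <= r for p, q, r in constraints):
            points.add((x1, x2))
    return points


best = max(corner_points(CONSTRAINTS), key=lambda p: objective(*p))
best_z = objective(*best)
```

For this model the search returns the corner $x_1 = 2$, $x_2 = 6$ with $Z = 36$, the same values found graphically.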

It is feasible to use the graphic approach for linear program
solution with up to three decision variables, $x_1$, $x_2$, and $x_3$.
Most objective functions will have more than three variables and
the solution can be found by use of a computerized Simplex Method.
The solution by computer is more complex than illustrated in the
above example; however, most texts on Operations Research will
provide the details of determining the optimal solution by means
of this method.

1.2.4 Testing The Model 

Test the particular linear model and the optimal solution that has
been determined to ascertain if it predicts safety or risk for each
alternative combination of operations with sufficient accuracy to
permit valid decisions. If at all possible, use historical data for
the system under study to simulate past operations which have known
outcomes (i.e., accidents, incidents, or safe operation). Compare
these outcomes with the results using the linear model with the
historical data substituted into the objective function. Much care
should be exercised to assure that the constraints derived for the
system at present were true when the historical data was generated.
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1.2.5 Controls 

Define the controls on the system operation which the linear program
indicates have a bearing on optimizing safety. Controls may take
the form of safety standards or safety operating criteria. The
requirements that certain operations must occur in series, in some
ordered sequence, or concurrently form controls which can optimize
safety.

1.2.6 Assurance of Control Implementation 

When systems managers impose the recommended controls, monitor the
system operations to determine that they do in fact tend to reduce
risk. Review of accident and incident reports before and after the
controls were implemented may be helpful. Direct communication with
the system operators is virtually essential throughout an entire linear
programming analysis, and is especially beneficial during the
assurance phase.
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2.0 NETWORK ANALYSIS 

2.1 INTRODUCTION 

Network Analysis has been applied very successfully for increasing
the efficiency of manufacturing processes, decreasing the handling
and shipping delays encountered in product distribution systems,
and maximizing the probability of meeting program schedules. The
method is very general and fundamental to the simulation of systems 
or combinations of operations. Applications may be possible for 
system safety analysis if analogies can be made between appro- 
priate system characteristics and the concepts of flow and path 
length. For example, the object of an emergency egress system 
is to evacuate as many people as possible in the shortest time 
possible, and in the safest possible way. The latter objective 
considers the vulnerability of the escapees to the accident created 
environment (heat, pressure, etc.) as well as the inherent safety 
of the egress system in use. The analysis of such an egress system
would require three networks: one to maximize the flow of people;
one to minimize path lengths from work stations to the defined safe
area; and one to minimize vulnerability of the escapees within the
constraints of each possible accident in the work area. The optimum
network must then be chosen, using the method of Linear
Programming if necessary.

The following paragraphs will summarize the network model and three 
uses of the method to optimize flow, path length, and path alter- 
natives. 

2.2 GRAPHIC MODEL 

The representation of the real system or set of physical operations 
used in Network Analysis is a graph consisting of junctions, called 
"nodes" and connection lines called "branches". The junctions re- 
present functional points in the system and the branches indicate 
the existing interfaces or interdependencies of the functional 
points. If a flow is associated with each branch, the graph is 
considered a "network". In the graph example the junctions are
circles and the branches are the interconnecting lines.

Figure B-10 - GRAPH EXAMPLE

A "chain" is a series of nodes and branches that connects a pair of
nodes. For example, one possible chain between 1 and 8 is (1,2), (2,4),
(4,6), (6,8) or the reverse (8,6), (6,4), (4,2), (2,1). If a
direction of flow through the chain is specified, it is called a
"path". A chain connecting a node to itself is termed a "cycle".

A graph for which every pair of nodes is connected through a
chain is called a "connected graph". A connected graph which does
not contain any cycles is a "tree". One graph theorem states that
a graph containing n nodes is connected if it has (n-1) branches
and no cycles. Such a graph would also be a tree. A branch is
"directed" if a sense of direction is associated with it so that
the node at one end can be considered a source and the node at the
opposite end can be interpreted as a sink. A connected graph in
which all branches are directed is a "directed graph". If a
directed graph is a network, the direction is assumed to be the
feasible direction of flow in each path. A network is not directed
if flow can occur in both directions along one or more paths. The
"capacity" of flow is the maximum feasible flow in one direction.
Capacity can be any non-negative number from zero to infinity.

If capacity in one direction along a path is zero, the branch is
directed. If all paths connected to a node are directed away from
the node, it is a source. If all of the connected paths flow into
the node, it is a sink.
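
The definitions of "connected graph" and "tree" lend themselves to a short check. The sketch below is illustrative only; the function names `is_connected` and `is_tree` are assumptions, and branches are treated as undirected pairs of nodes.

```python
def is_connected(nodes, branches):
    """Return True if every pair of nodes is joined by some chain."""
    nodes = list(nodes)
    if not nodes:
        return True
    neighbors = {n: set() for n in nodes}
    for u, v in branches:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        # The set difference builds a new set, so `seen` can be updated safely.
        for nxt in neighbors[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(nodes)


def is_tree(nodes, branches):
    """A connected graph with n nodes and (n - 1) branches contains no cycles."""
    nodes = list(nodes)
    return is_connected(nodes, branches) and len(branches) == len(nodes) - 1


# The chain (1,2), (2,4), (4,6), (6,8) from the text, taken as a graph by itself.
chain_nodes = [1, 2, 4, 6, 8]
chain_branches = [(1, 2), (2, 4), (4, 6), (6, 8)]
chain_is_tree = is_tree(chain_nodes, chain_branches)
```

Adding one more branch between any two of these nodes would close a cycle, and the graph would then remain connected but no longer be a tree.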

2.2.1 Maximum Flow Problems

Consider a network with a source at one end and a sink at the other,
and assume no loss of flow at each intermediate node. The object
is to determine the feasible steady state flow pattern which
maximizes the flow from the source to the sink.



MAXIMAL FLOW PROBLEM
Figure B-11
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2.2,1 (Continued) 

The flow capacity is indicated for each path by the node from which
the flow enters the path. For example, the flow from 1 to 2 can
be 7, but the flow capacity from 2 to 1 is zero. The solution of
the network is accomplished by the iterative process of assigning
and reassigning a feasible flow for each chain from the source to
the sink until the positive flow capacity has been used in each
chain. The total flow obtained this way will be optimal, but is
not necessarily the only optimal flow pattern.

One possible flow in the example is 3 along the chain 1, 2, 4, 7.
Since only net flow through a path is significant, it is possible
to assign fictitious negative flows in the reverse direction. The
remaining capacity in each path of the chain is found by decreasing
the positive flow capacity on each path by the assigned flow value
of the smallest capacity along the chain. The example then becomes
the network shown below.
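
The bookkeeping for one chain can be sketched as follows. This is an illustration under stated assumptions: the function name `assign_flow` is hypothetical, and of the capacities shown only the value 7 from node 1 to node 2 is stated in the text; the other two are invented for the example.

```python
def assign_flow(capacity, chain):
    """Assign the largest feasible flow along `chain` and update capacities.

    `capacity[(u, v)]` is the remaining flow capacity from node u to node v.
    """
    paths = list(zip(chain, chain[1:]))
    flow = min(capacity.get(p, 0) for p in paths)   # smallest capacity on the chain
    for u, v in paths:
        capacity[(u, v)] = capacity.get((u, v), 0) - flow
        # The fictitious reverse capacity lets a later chain cancel this flow.
        capacity[(v, u)] = capacity.get((v, u), 0) + flow
    return flow


# Capacity 1 -> 2 is from the text; 2 -> 4 and 4 -> 7 are hypothetical.
capacity = {(1, 2): 7, (2, 4): 3, (4, 7): 9}
flow = assign_flow(capacity, [1, 2, 4, 7])
```

After the call, the path with the smallest capacity is used up, and each path of the chain shows a matching capacity in the reverse direction.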



NETWORK WITH A FLOW OF 3 THROUGH 1, 2, 4, & 7
Figure B-12

Assign a flow of 7 through 1, 3, 5, 7; a flow of 2 through 1, 2, 5,
4, 7; a flow of 2 through 1, 2, 6, 7; and a flow of 3 through
1, 3, 6, 7. The resulting network is optimal in this case, since
the total capacity of the sink, 17, is assigned.



RESULTING NETWORK WITH A TOTAL FLOW OF 17 


Figure B-13
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2.2,1 (Continued) 


This is a special case of the "max-flow min-cut" theorem which
states that, for any network with a single source and sink, the
maximum feasible flow from source to sink equals the minimum cut
value for all the cuts of the network. A minimal cut is shown
below. From the theorem, the value of any cut provides an upper
bound to the flow, and the least upper bound would then be the
maximum possible flow.



NETWORK WITH MINIMUM CUT SHOWN (min cut = 17)
Figure B-14

Had the minimum cut been recognized at the beginning, the solution
process could have been shortened, and each chain would not have
to be worked out.

When networks become complex, it is desirable to shorten the
solution by use of the computer. This may be done by programming
the computer to sum successive cuts through the network until the
minimum cut is found, or by having the computer solve the feasible
chains and assign flows until no positive flow capacity is left
in the network.
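
Both computer approaches mentioned above can be sketched briefly. The example is a hedged illustration, not the handbook's program: `max_flow` assigns flows along feasible chains found by breadth-first search, and `min_cut_value` sums every cut by brute force, which is practical only for small networks. The network is hypothetical and is not the one in Figure B-11.

```python
from collections import deque
from itertools import combinations


def max_flow(capacity, source, sink):
    """Assign flows along feasible chains until no positive capacity remains."""
    residual = dict(capacity)
    nodes = {n for edge in capacity for n in edge}
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total              # no feasible chain is left
        chain, v = [], sink
        while parent[v] is not None:
            chain.append((parent[v], v))
            v = parent[v]
        flow = min(residual[e] for e in chain)
        for u, v in chain:
            residual[(u, v)] -= flow
            residual[(v, u)] = residual.get((v, u), 0) + flow
        total += flow


def min_cut_value(capacity, source, sink):
    """Sum the capacity across every cut separating source from sink."""
    others = sorted({n for edge in capacity for n in edge} - {source, sink})
    best = None
    for k in range(len(others) + 1):
        for group in combinations(others, k):
            side = {source, *group}   # nodes on the source side of the cut
            value = sum(c for (u, v), c in capacity.items()
                        if u in side and v not in side)
            best = value if best is None else min(best, value)
    return best


# Hypothetical network: node 1 is the source and node 5 is the sink.
capacity = {(1, 2): 7, (1, 3): 4, (2, 3): 2, (2, 4): 3,
            (3, 4): 5, (3, 5): 3, (4, 5): 6}
```

For this network both functions return the same value, as the max-flow min-cut theorem requires.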

A correlation to the emergency egress problem may be made in which
the source is the location of the escapees at the time of the alarm,
the network represents the alternate routes that the people may
choose, and the sink may be the point at which a safe environment
is available. This problem closely represents an escape situation
where medical or rescue teams must stay together during escape.
2.2.2 Minimum Path Problems

Consider the connected network shown below in which the length of 
each branch is known. The object is to determine the shortest 
route from the origin to the terminus. 


Figure B-15 - Minimum Path Network

The shortest method of finding the minimum path is to start at the
origin and successively select the shortest paths to the adjacent
nodes in ascending order of their distances. When the terminus
is reached, the shortest path will have been identified.

The distance from node to node is shown below in tabular form.

NODE      O     A     B     C     D     E     F     G     H     T

BRANCH-   OA-7  AD-6  BE-4  CD-2  DC-2  EB-4  FD-2  GC-3  HE-6
LENGTH    OB-8  AB-7  BD-6  CF-3  DF-2  EH-6  FC-3  GF-5  HG-8
                AC-8  BA-7  CG-3  DA-6  ED-7  FG-5  GD-6  HT-8
                            CA-8  DB-6  EG-9  FT-9  GH-8
                                  DG-6              GT-8
                                  DE-7              GE-9

Figure B-16 - Distance Node to Node

Step 1: The shortest distance to the closest adjacent node is
7 to A. Circle OA-7, write 7 over A node's column, and
cross out the branches leading to A, as shown below.

[Table of Figure B-16 with OA-7 circled, 7 written over column A,
and the branches leading to A (BA-7, CA-8, DA-6) crossed out]

Figure B-17 - Step 1

Step 2: The candidates for the next nearest nodes to A and O
are B and D. The comparison of distance from O yields
8 for B and 13 for D, so select B. Circle OB-8, write
8 above B node's column and cross out all branches
leading to B. Circle the node column when all choices
have been considered.

[Table with OA-7 and OB-8 circled, 7 and 8 written over columns
A and B, and the branches leading to A and B crossed out]

Figure B-18 - Step 2

Step 3: Candidates for nodes closest to O and B are D and E.
The shortest route from O to D is 7 + 6 = 13 through A,
and the distance to E from O is 8 + 4 = 12 through B.
Select E and change the list as below.

[Table with OA-7, OB-8, and BE-4 circled; 7, 8, and 12 written over
columns A, B, and E; and the branches leading to A, B, and E
crossed out]

Figure B-19 - Step 3

Step 4: The distance to D from O through A is 13 and through B
is 14, and from O to H through E is 12 + 6 = 18.
Select D because, of the candidates, it is the closest to O.
(G is not a candidate because of the length 9 of EG,
which makes the distance from O to G greater than that
from O to H or O to D.)

[Table with OA-7, OB-8, BE-4, and AD-6 circled; 7, 8, 13, and 12
written over columns A, B, D, and E; and the branches leading to
A, B, D, and E crossed out]

Figure B-20 - Step 4

Step 5: Candidates for new nodes closest to both D and O are
C, F, and H. The distance to C from O is 7 + 8 = 15
through A. The shortest distance to D from O has been
shown in step 4 to be 13, so the distance to C and F
through D is 13 + 2 = 15 in both cases. The shortest
distance to H is through E. The distance OH is then
12 + 6 = 18. Nodes C and F are equidistant, so select
both. Use the chain OAC or OADC since the distances
are equal. The modified table is shown below. When
looking at C, cross out all paths into C other than
from A or D, and when looking at F, cross out paths
to it other than from D.





SHEET BII-207 






[Table with OA-7, OB-8, BE-4, AD-6, AC-8, DC-2, and DF-2 circled;
7, 8, 15, 13, 12, and 15 written over columns A, B, C, D, E, and F;
and the remaining branches leading to those nodes crossed out]

Figure B-21 - Step 5

Step 6: New nodes closest to O and C are F and G. Path CF
has been eliminated in step 5, but G is still a
candidate. The distance to G from O through C is
15 + 3 = 18, and through D is 13 + 6 = 19. The
path from O to H through E has not yet been eliminated,
and it ties with the OACG path at 12 + 6 = 18. Because
of the equality select both node G (through C) and node H.

[Table with CG-3 and EH-6 also circled; 18 written over columns
G and H; and the remaining branches leading to G and H crossed out]

Figure B-22 - Step 6

Step 7: Consider nodes F, G, and H. The next new node is T,
the terminus. The distances through F, G, and H to
T are 15 + 9 = 24 for F; 18 + 8 = 26 for G; and
18 + 8 = 26 for H. The shortest path is, therefore,
through F. The final table appears below. The
minimal path through the network is identified and
is O, A, D, F, T.
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(Continued) 
NODE @ 


7 8 15 13 12 15 18 18 

(*)(5)®©®®<s)(a) 


BRANCH- fOA-7) ( aD^ 6) (f&^ fi®*? (PC^2> S»C 
LENGTH 1 

(flED Mt (§B><S3> »6-S 

OS) feK 

£*< Mr £s=s (IS) 


DM 1 Bfictf 
CS« 8«C 
fl»< 8X 
®WL 
t»W5. 


24 

<2 


££<* &B*S. 

Figure B-23 - Step 7 

The correlation of the minimum path network to the emergency escape
problem depends on the assumption that the egress rate (or the
velocity of the escapees) is the same for all paths. The objective
is to select the shortest, and, therefore, the fastest path to the
safe place at the terminus. If the escape rate is not equal for
all paths, use time instead of distance to select
the quickest path, which may not be the shortest in distance.
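
The step-by-step selection above can be cross-checked with a short shortest-path sketch. The code below is a standard Dijkstra search, not the tabular method itself. The branch lengths are an assumption: they are the entries legible in the tables plus values implied by the step arithmetic (OA = 7, AD = 6 and BE = 4 are inferred), and branches that cannot be read are omitted.

```python
import heapq

# Assumed network: legible table entries plus lengths implied by the steps.
BRANCHES = {
    ("O", "A"): 7, ("O", "B"): 8, ("A", "C"): 8, ("A", "D"): 6,
    ("B", "E"): 4, ("D", "C"): 2, ("D", "F"): 2, ("C", "F"): 3,
    ("C", "G"): 3, ("D", "G"): 6, ("E", "H"): 6, ("E", "G"): 9,
    ("F", "G"): 5, ("F", "T"): 9, ("G", "T"): 8, ("H", "T"): 8,
    ("G", "H"): 8,
}


def shortest_path(branches, origin, terminus):
    """Return (total length, node list) of the shortest origin-terminus path."""
    adjacency = {}
    for (u, v), length in branches.items():
        adjacency.setdefault(u, []).append((v, length))
        adjacency.setdefault(v, []).append((u, length))

    distance = {origin: 0}
    previous = {}
    queue = [(0, origin)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > distance[node]:
            continue  # stale queue entry
        for neighbor, length in adjacency[node]:
            candidate = d + length
            if candidate < distance.get(neighbor, float("inf")):
                distance[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    # Walk back from the terminus to recover the chain of nodes.
    path = [terminus]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return distance[terminus], path[::-1]


if __name__ == "__main__":
    print(shortest_path(BRANCHES, "O", "T"))
```

Replacing each branch length with a travel time turns the same search into a quickest-path selection for unequal egress rates.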

2.2.3 Minimum Spanning Tree 

A variation of the Minimum Path Problem is the selection of the 
minimum path for a tree connecting all nodes. This tree could be 
used during the design of an egress system to assure the optimum 
placement of egress equipment relative to the work locations of 
personnel. As an example network, refer to the one used in this 
appendix in section 2.2.2. If there are some constraints to the 
selection of routes of egress, these should be defined at the 
start of the analysis. A typical constraint may be the flow
capacity along each branch. Another constraint may be the degree
of vulnerability of the escapees in each route relative to likely
accident induced environments. To simplify the solution explanation,
no constraints will be considered.


The minimal spanning tree can be determined in a straightforward 
manner. Beginning with any node, the first step is to pick the
shortest possible branch to an adjacent node. The second step is 
to find the new node which is closest to either of the two connected 
nodes and add the appropriate branch. This process is continued 
until all nodes have at least one branch connecting them to the tree. 
The resulting network derived in this way is a minimum spanning tree. 
Further, the first node selected has no bearing on the resulting
tree, if branch length is the only variable. If constraints must
be considered, orientation or certain node pairs may need to be
directly connected. In that case, add the required branches
to the network, and then solve for the minimum spanning tree in
the remaining portion of the network.



EXAMPLE MINIMUM SPANNING TREE 
Figure B-24 

Using the example from section 2.2.2, the minimum spanning tree
connecting all nodes appears as above. This represents the
smallest total branch length that will connect all nodes. Had
the path DC been precluded from choice by some constraint, the
branch CF would have been used to connect C into the network.
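
The tree-growing procedure described in this section (start at any node, then repeatedly add the shortest branch that reaches a new node) can be sketched as follows. This is a minimal illustration, assuming the same network as section 2.2.2 with only the legible or inferred branch lengths, so the result may differ from Figure B-24 where a branch could not be read.

```python
import heapq

# Assumed network: legible table entries plus lengths implied by the steps.
BRANCHES = {
    ("O", "A"): 7, ("O", "B"): 8, ("A", "C"): 8, ("A", "D"): 6,
    ("B", "E"): 4, ("D", "C"): 2, ("D", "F"): 2, ("C", "F"): 3,
    ("C", "G"): 3, ("D", "G"): 6, ("E", "H"): 6, ("E", "G"): 9,
    ("F", "G"): 5, ("F", "T"): 9, ("G", "T"): 8, ("H", "T"): 8,
    ("G", "H"): 8,
}


def minimum_spanning_tree(branches, start):
    """Grow a tree from `start`, always adding the shortest branch
    that connects a node not yet in the tree."""
    adjacency = {}
    for (u, v), length in branches.items():
        adjacency.setdefault(u, []).append((length, u, v))
        adjacency.setdefault(v, []).append((length, v, u))

    connected = {start}
    frontier = list(adjacency[start])
    heapq.heapify(frontier)
    tree = []
    while frontier and len(connected) < len(adjacency):
        length, u, v = heapq.heappop(frontier)
        if v in connected:
            continue  # branch would close a loop
        connected.add(v)
        tree.append((u, v, length))
        for branch in adjacency[v]:
            if branch[2] not in connected:
                heapq.heappush(frontier, branch)
    return tree


if __name__ == "__main__":
    tree = minimum_spanning_tree(BRANCHES, "O")
    print(tree, sum(length for _, _, length in tree))
```

Removing the ("D", "C") entry from the assumed branch table and re-running the function connects C through branch CF, in line with the remark above about precluding path DC.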
1.0 FAULT TREE ANALYSIS FLOW

1.1 ANALYSIS ACTIVITY 

The following problem solving steps are considered essential for 
a systems approach to safety. These steps will enable the risk 
of undesired (hazardous) events identified in the system to be 
maintained at an acceptable level. Starting with the system
definition and information pertaining to the system configuration,
the steps are:

1) Identification of undesired events; 

2) Structuring identified undesired events into a fault tree; 

3) Determination of fault inter-relationships; 

4) Evaluation for "likelihood" of identified undesired events; 

5) Trade-off decisions and/or corrections.

As depicted in Figure C1, steps 1) and 2) above are necessary to
develop what is commonly known as a "Top" logic diagram. The top 
logic diagram plays an essential part in performing a system safety 
fault tree analysis. It is a starting guide which shows how and 
where the fault tree is to be developed (or expanded) by further 
analysis activity. It organizes all of the system unique logic 
relationships into a pattern whereby the system hardware and soft- 
ware functions can be analyzed in an orderly and logical manner. 
This means that the top must be structured so that the end analysis 
is complete in satisfying what is defined by the top undesired 
event(s). 

System unique logic relationship variables which must be carefully
structured are things such as: a) system operation modes,
b) mission phases and/or operations, c) the degree of man/machine
relationship in the system, d) inter-relationships of the Centers
with the system functions, and e) functional order of the system.

This list of relationship variables covers the top structure 
gross considerations, and indicates the types of activity involved. 
The system unique logic relationship variables will vary with the 
different systems being analyzed, with the degree of difference 
depending upon the similarity between systems. 

As already stated, the top logic diagram is a starting "guide" 
for a complete system fault tree analysis. This means that once 
the top is started it is not necessarily "cast in concrete", but 
is subject to change as analysis activity progresses. Experience 
has shown that as an analysis proceeds to completion, more
system information and understanding is gained. As system information
and understanding develop, modification to the top logic
diagram is required to reflect this current knowledge.

Step 3) is the actual development of the logic diagram. This 
is the point where analysis activity proceeds from the top 
logic diagram structure and continues through the hardware level. 
This step is the foundation of a fault tree analysis. The fault 
mode relationships, once correctly and completely structured, 
will usually never change - unless hardware design changes occur. 

Step 4) is an evaluation of the completed fault tree for the
purpose of: a) determining the likelihood of identified events,
and b) determining the identity and ranking of "chains" of events
and event relationships leading to the identified undesired
event(s). Evaluation can be accomplished by rigorous mathematical 
processes (quantitative evaluation) or from intuitive (inductive) 
methods. However, the results obtained (quantitative/inductive) 
will only be as complete as the applied rigor. Useful results 
can be obtained from evaluations made during the course of 
development of the fault tree analysis. 

Should a quantitative evaluation be required, an equation can be
written for the entire fault tree. By use of Boolean algebra, 
Lambda Tau methods or Monte Carlo methods the equation can be 
simplified and solved to give a meaningful solution. Except for 
very small trees, the use of a computer is required. See the 
list of references for sources of information on employing these 
mathematical solutions. 
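
As an illustration of quantitative evaluation for a very small tree, the sketch below computes the exact probability of the top event by enumerating every combination of basic-event states. The tree, the event names A, B and C, and the probabilities are hypothetical, and the basic events are assumed to be statistically independent; this is not the Lambda Tau or Monte Carlo method mentioned above.

```python
from itertools import product


def top_event_probability(top, probabilities):
    """Sum the probability of every basic-event state in which `top` is true.

    `top` maps a dict of {event name: failed?} to True/False.
    `probabilities` maps each event name to its failure probability.
    """
    names = sorted(probabilities)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, states))
        if top(assignment):
            weight = 1.0
            for name, failed in assignment.items():
                p = probabilities[name]
                weight *= p if failed else 1.0 - p
            total += weight
    return total


# Hypothetical tree: TOP = (A AND B) OR C
def example_top(events):
    return (events["A"] and events["B"]) or events["C"]


EXAMPLE_PROBABILITIES = {"A": 0.1, "B": 0.2, "C": 0.05}

if __name__ == "__main__":
    print(top_event_probability(example_top, EXAMPLE_PROBABILITIES))
```

Enumeration grows as 2 to the power of the number of basic events, which is one reason a computer and a simplified Boolean equation are needed for anything but very small trees.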

Step 5) If it is determined through the evaluation of the fault 
tree (or as a result of other analyses) that corrective action 
is required, the fault tree analysis itself is a valuable source 
of information for change decisions. Proposed corrections such 
as design changes, procedure changes, etc., can be evaluated in 
the context of the fault tree to determine a relative measure of
improvement. 

In order to achieve a meaningful and useful analysis, two 
important points must be emphasized. First, the output of an 
analysis is only as valuable and reliable as the quality and 
quantity of effort and information going into the analysis. 

Second, hardware and operating procedures configuration control 
must be maintained at all times to avoid erroneous conclusions 
being drawn from the analysis. 

1.2 PROGRAM ACTIVITY

The Fault Tree technique can be used to perform a complete 
system-integrated analysis, or for a small problem containing 
less than ten events. In any case the flow sequence of analysis
will follow the outline to some degree as described below.

The flow of activity necessary for a complete system-integrated
fault tree analysis should follow a pattern as shown in 
Figure C2. This flow takes into consideration the steps required 
to perform an analysis, along with the difficult task of con- 
solidating the event analyses into one complete system/mission 
oriented analysis. 

As shown in Figure C2, the first step in the analysis program 
development is the structuring of the top logic diagram. After 
a suitable top has been structured and agreed upon by all involved, 
each of the analysts is assigned specified portions of the fault 
tree for further development. While the analyses are being con- 
ducted, the task of reviewing the output of each analysis and 
combining the output into one complete systems analysis is per- 
formed by those who developed the top diagram. When the analysis 
for system safety is complete, it will be documented. 

An important factor necessary in accomplishing a system-integrated 
analysis is effective communications on a "day-to-day" basis 
between all the analysts involved. 

1.3 FAULT TREE 

The following guidelines may be used to achieve a consistency 
of approach and to assure analysis completeness. 

1) Structuring should follow the rules and symbolism used in 
this appendix, since they are well standardized throughout 
the aerospace industry. 

2) Each "diamond" event should have the following information 
and reason for analysis termination of the event: 

(a) Insignificant (with rationale), or 

(b) Lack of system information, or 

(c) Identification of other analyses which satisfactorily 
analyze the failure modes and system effects for that 
event. 

3) Development information sources should be identified by 
schematic, flow, time, mechanical, electrical, operation, 
maintenance drawing and/or document numbers. The revision 
date and/or number must be included for each source. This 
source information must be included as part of each submittal. 



NUMBER D2-1 19062-1 

4) Each analyst must utilize the fault tree alphabetic code 
assignments made in the computer drawing program, if one 
is being used. 

5) Revision codes should be included by each analyst and can 
be based on the standard practice of assigning progressive 
alphabetic characters beginning with A. 

6) Identify all components and subsystems by part number. 


Drawing the Tree 

In some cases, the analysts may make hand sketched trees, and
document the evaluation and conclusions. In other cases, where
more complicated trees are involved, and presentations to substantiate
the conclusions must be made to management,
formal drafted trees may be prepared. Where complicated integrated
systems are being analyzed, there are computer controlled
drafting systems available. See the list of references for
sources of information on these systems.
2.0 FAULT TREE PROCEDURE

2.1 GENERAL

A fault tree is a diagram of the logical relationships of parallel
and series combinations of independent personnel or equipment subsystem
and component failures and normal operating modes that can
result in a specified undesired event. This diagram can be quantified
to provide a relative probability of causing the specified undesired
event by means of each path leading to that event. Paths
having high relative probability are considered dominant over paths
of low probability.

The following sections discuss basic rules, definitions and methods 
of the fault tree technique. 

2.2 EVENT DESCRIPTION 

The term "event" denotes a dynamic change of state that occurs to 
a system element, where an element is inclusive of hardware, software, 
personnel and environment. If the change of state is such that the 
intended function of the particular element is not achieved, or an 
unintended function is achieved, the event is an abnormal system 
function or "fault event." If the change of state is such that the
intended function occurs as planned (designed), the event is then a
normal system function or "normal event." Thus, two types of events
exist — those which are not intended and those which are intended. 

Fault events can be divided into two categories: basic events and
gate events. Basic events are events whereby system elements (usually
at the component level) go from an unfailed state to a failed state,
and are related to a specific failure rate and fault duration time.
These events are used only as inputs to a logic gate (never as out- 
puts) and are therefore independent events. On a fault tree, basic 
events are depicted by a circle or a diamond. A gate event is the 
event (or system failure) which results from the output of a logic 
gate. Since the gate event is dependent upon the input events and 
the type of logic gate function, it is therefore a dependent event. 

It must be noted that the gate event is not the logic gate itself, 
but the result of the logic gate function and the input events. The 
gate event is depicted by a rectangle above the logic gate. As fault 
tree development progresses, gate events on one level become inputs 
to gate events on the next higher level. (See Section 2.3 for 
examples. ) 

In the fault tree analysis of a system the inherent modes of failure 
of system elements are delineated as primary, secondary and command. 
These failure modes are referred to as "primary events," "secondary 
events," and "command events" respectively, and are depicted on the 
fault tree as the combination of basic events and/or gate events. In 
other words, these events are generally identified at a gate event
level, and depending on the level of analysis, are further developed
until the event can be identified in terms of basic events.

In a fault tree analysis, the dynamic change of state that occurs
to a system element is defined as a binary type event. That is,
a system element is always in one of two states, ON or OFF. The ON
state (or 1) corresponds to a failed condition and the OFF state (or
0) corresponds to an unfailed condition. The example below illustrates
the binary manner of a system element. The element operates
normally (OFF state) until failure occurs (ON state). After the
fault event occurs (dynamic change of state) the element remains
failed (ON state) until repair of some sort has been effected. When
repair is accomplished, the element returns to the unfailed state
(OFF). By representing events and gates in a binary manner, fault
trees can be analyzed by the rigorous techniques of Boolean algebra.

[Figure: state of element versus time, alternating between OFF (0) and ON (1), with an event duration time marked for each of two failures]

A - Time of 1st failure
B - Time 1st failure is repaired
C - Time of 2nd failure
D - Time 2nd failure is repaired
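
The binary behavior of an element can be sketched as a function of time. In the example below the failure and repair times standing in for A, B, C and D are hypothetical values chosen only for illustration; each fault interval runs from the time of failure up to, but not including, the time of repair.

```python
def element_state(t, fault_intervals):
    """Return 1 (ON, failed) if time t falls in any (failure, repair)
    interval, otherwise 0 (OFF, unfailed)."""
    return int(any(start <= t < end for start, end in fault_intervals))


def total_fault_duration(fault_intervals):
    """Sum of the event duration times (repair time minus failure time)."""
    return sum(end - start for start, end in fault_intervals)


# Hypothetical times: A = 2.0, B = 3.5, C = 7.0, D = 8.0
FAULT_INTERVALS = [(2.0, 3.5), (7.0, 8.0)]

if __name__ == "__main__":
    for t in (1.0, 2.5, 5.0, 7.5):
        print(t, element_state(t, FAULT_INTERVALS))
    print(total_fault_duration(FAULT_INTERVALS))
```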


2.3 SYMBOLS


Rectangle 


The rectangle identifies an event (gate event) that results from
the combination of fault events through a logic gate. The rectangle
is also used to describe a conditional input to a functional condition
INHIBIT gate (described below).


Circle 

The circle describes a basic fault event that requires no further 
development. The frequency and mode of failure of items so identified 
is derived from empirical data. The rate of occurrence of such a 
primary event is normally the generic failure rate of the component 
for the particular failure mode.


House 


The house indicates an event that must occur (or is expected
to occur) due to normal operating conditions in the system. The
house does not indicate a fault event. An example is a phase
change in a dynamic system, such as the landing, flight, and take-off
phases of an aircraft.


Diamond 


The diamond describes a fault event that is considered basic in a 
given fault tree. The possible causes of the event are not developed 
either because the event is of insufficient consequence or the 
necessary information for further development is unavailable. It also 
can indicate non-development because an analysis already exists that 
is of satisfactory depth and breadth. Which of the three uses
applies should be indicated for each diamond on the tree.



Oval 

The oval is used to record the conditional input to a random condition 
INHIBIT gate. It defines the state of the system that permits a 
fault sequence to occur, and may be either normal to the system or 
result from failures. It is also used to indicate the necessary 
sequence of events required to pass through an "AND" or an "OR" gate 
function.

Double Diamond 

The double diamond is used in the simplification of a fault tree 
for numerical evaluation. The event described results from the 
causes that have been identified, but are not shown on a particular 
version of the fault tree being examined. 
"AND" Gate

The "AND" gate describes the logical operation whereby the co- 
existence of all input events is required to produce the output 
event. The fault duration time of an "AND" gate is expressed in 
terms of the input fault duration times. 


Example of "AND" Gate Usage:

[Figure: circuit containing a power source, and the corresponding fault tree]

Another example of "AND" Gate Usage:

[Figure: circuit and the corresponding fault tree for the event "Light "C" On", with "Switch Closed" among the input events]


"OR" Gate 

The "OR" gate defines the situation whereby the output event 
will exist if one or more of the input events exists. The 
fault duration time of an "OR" gate is expressed in terms of 
the input fault duration times. 
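
The "AND" and "OR" gate definitions can be sketched as Boolean functions over binary event states. The probability helpers are an added illustration and assume statistically independent inputs, which the gate definitions themselves do not state; fault duration times are not modeled here.

```python
from math import prod


def and_gate(*inputs):
    """Output exists only if all input events co-exist."""
    return all(inputs)


def or_gate(*inputs):
    """Output exists if one or more input events exist."""
    return any(inputs)


def and_probability(*p):
    """P(output) for an AND gate with independent inputs (assumption)."""
    return prod(p)


def or_probability(*p):
    """P(output) for an OR gate with independent inputs (assumption)."""
    return 1.0 - prod(1.0 - x for x in p)


if __name__ == "__main__":
    print(and_gate(True, False), or_gate(True, False))
    print(and_probability(0.1, 0.2), or_probability(0.1, 0.2))
```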


[Symbol: "OR" gate, one output, two or more inputs]

"PRIORITY AND" Gate

The "PRIORITY AND" gate performs the same logic function as the
"AND" gate with the additional stipulation that sequence as well
as co-existence is required.
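
A minimal sketch of the sequence requirement follows. It assumes each input is represented by its time of occurrence (or None if it has not occurred), that the inputs are listed in the required order, and that "sequence" means strictly earlier; these representation choices are illustrative assumptions.

```python
def priority_and_gate(occurrence_times):
    """Return True only if every input has occurred and the inputs
    occurred in the listed order.

    occurrence_times: list of times in required priority order;
    None marks an input event that has not occurred.
    """
    if any(t is None for t in occurrence_times):
        return False  # co-existence not satisfied
    pairs = zip(occurrence_times, occurrence_times[1:])
    return all(earlier < later for earlier, later in pairs)


if __name__ == "__main__":
    print(priority_and_gate([1.0, 2.0]))   # in order
    print(priority_and_gate([2.0, 1.0]))   # wrong order
    print(priority_and_gate([1.0, None]))  # second input absent
```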


[Symbol: "PRIORITY AND" gate, one output, a priority description, two or more inputs]


"CONSTANT FAULT DURATION AND" Gate 

The "CONSTANT FAULT DURATION AND" gate describes the
same logical function as the "AND" gate except that the fault 
duration time of the output event is not dependent upon the fault 
duration times of the inputs. The fault duration time of this gate 
is determined as a function of the system operation. 


[Symbol: "CONSTANT FAULT DURATION AND" gate, one output, a fault duration time, two or more inputs]


Example of "CONSTANT FAULT DURATION AND" Gate Usage:


Consider the undesired event "Rocket Motor Inadvertently Ignited."
Assume the "armed" event results in a warning light prompting immediate
repair action. If the "armed" event occurs and the warning system
is working, the fault duration time is one unit. If the "armed"
event occurs and the warning system has failed, the fault duration
time is naturally longer, being dependent upon how often the monitoring
system is functionally checked.


[Figure: fault trees for the undesired event "Rocket Motor Inadvertently Ignited". Event labels: Safe-Arm Mechanism Armed; Ignition Current Present; Missile Armed (A); Missile Armed (A'); Safe-Arm Mechanism Monitor System Failure.]

A = 1 Unit Fault Duration Time
A' = 1 Unit Fault Duration Time + Monitoring System Functional Check Time
"EXCLUSIVE OR" Gate

The "EXCLUSIVE OR" gate functions as an OR gate with the restriction
that specified inputs cannot co-exist. This gate will not respond
to the co-existence of two or more specified input events.
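
A minimal sketch of this restriction, assuming every input is one of the "specified" inputs, so that the gate responds only when exactly one input event exists. The two-input probability helper additionally assumes independent inputs.

```python
def exclusive_or_gate(*inputs):
    """Output exists if exactly one input event exists; the gate does
    not respond when two or more inputs co-exist."""
    return sum(bool(x) for x in inputs) == 1


def exclusive_or_probability(p1, p2):
    """P(output) for two independent inputs (assumption):
    one occurs while the other does not."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


if __name__ == "__main__":
    print(exclusive_or_gate(True, False))
    print(exclusive_or_gate(True, True))
    print(exclusive_or_probability(0.1, 0.2))
```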


[Symbol: "EXCLUSIVE OR" gate, one output, a restriction, two or more inputs]


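Example of "EXCLUSIVE OR" Gate Usage:

Assume: Twin, side mounted engine vehicle.

[Figure: fault tree in which the output event "Asymmetric Thrust" results from "Loss of Engine No. 1" or "Loss of Engine No. 2" through an "EXCLUSIVE OR" gate with the restriction "Not Both Simultaneously"]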


"CONSTANT FAULT DURATION OR" Gate 


The "CONSTANT FAULT DURATION OR" gate performs the same function as 
the "OR" gate except that the fault duration time of the output event 
is not dependent upon the fault duration times of the inputs. The 
fault duration time of the output event is strictly dependent upon 
system operation variables, and must be determined from system 
information rather than in terms of the input event fault duration 
times. 


[Symbol: "CONSTANT FAULT DURATION OR" gate, one output, a fault duration time, two or more inputs]

"INHIBIT" Gates 


"INHIBIT" gates describe a causal relationship between one fault 
and another. The input event directly produces the output event 
if the indicated condition is satisfied. The conditional input 
defines a state of the system that permits the fault sequence to 
occur, and may be either normal to the system or result from failures. 
The conditional input is represented by an oval if it describes a 
specific failure mode and a rectangle if it describes a condition 
that may exist for the life of the system. The conditional input 
is further described on the following pages. The logical "INHIBIT" 
functions are symbolized in fault trees as follows: 



"FUNCTIONAL CONDITION INHIBIT" Gate 

The "FUNCTIONAL CONDITION INHIBIT" gate provides a means for
applying conditional probabilities to the fault sequences. If the
input event occurs and the "condition" is satisfied, an output event
will be generated. The duration time of the output event may be
either the duration time of the fault input or may be separately
generated.
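
The conditional behavior of an "INHIBIT" gate can be sketched as follows. The probability helper treats the condition value as the probability that the condition is satisfied given that the input event occurs; the function names and numbers are illustrative assumptions, and duration times are not modeled.

```python
def inhibit_gate(input_event, condition_satisfied):
    """Output is generated only if the input event occurs and the
    indicated condition is satisfied."""
    return bool(input_event and condition_satisfied)


def inhibit_probability(p_input, p_condition):
    """P(output) = P(input) * P(condition satisfied, given the input)."""
    return p_input * p_condition


if __name__ == "__main__":
    print(inhibit_gate(True, False))
    print(inhibit_probability(0.01, 0.3))
```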




Example of "FUNCTIONAL CONDITION INHIBIT" Gate Usage:




"RANDOM CONDITION INHIBIT" Gate 

The "RANDOM CONDITION INHIBIT" gate is the same as the "FUNCTIONAL 
CONDITION INHIBIT" gate except that the status of the conditional 
input to a "RANDOM CONDITION INHIBIT" gate is variable while it 
remains constant in the "FUNCTIONAL CONDITION INHIBIT" gate. The 
fault duration time of the output event is always generated within 
the gate. 



Example of "RANDOM CONDITION INHIBIT" Gate Usage:
2.4 SPECIAL SYMBOLS

2.4.1 "MATRIX" Gate Introduction



The "MATRIX" gate is used to describe a situation in which an 
output event is produced for certain combinations of events at 
the inputs. A matrix showing the event combinations that produce
the output event accompanies each usage of this symbol. 


Example of "VARIABLE TYPE MATRIX" Gate Usage 
"CONDITIONAL MATRIX" Example

[Figure: "CONDITIONAL MATRIX" gate example. Output event: Airplane Crashes. Variable input: Airplane Faults (Faults Causing Rudder to Jam; Faults Causing Aileron to Jam; Faults Causing Throttle to Jam on High rpm). Conditional input: Plane Phases (1. Yaw, 2. Roll, 3. Pitch).]
2.4.2 Introduction to Advanced Concepts in the Usage of the Matrix Gate

In fault tree analysis of systems and subsystems many fault events 
are used repeatedly in order to denote the proper sequence of logic 
leading to an undesired event. Frequently the redundant fault events 
are related to one another by a second fault event, resulting in a 
unique combination of events. When these combinations are expressed 
by conventional fault tree techniques, the result is usually long 
and repetitive. The Matrix Gate is a method by which fault tree 
diagram construction is simplified with reference to permutations 
of redundant (or similar) fault events. 

It must be emphasized that the Matrix Gate is not a unique logic 
operator in fault tree analysis techniques. The Matrix Gate is 
merely a simplified or abbreviated representation of an already 
existing portion of a fault tree; the existing portion of a fault 
tree being a series of two-input AND gates (with related inputs) 
summed together by an OR gate. 

Whenever the Matrix Gate is used it is accompanied by a matrix, 
whose elements are the redundant (or similar) fault events. This
matrix is necessary in order to denote which combinations of events
are applicable to the analysis, the total number of combinations, 
and the probability of a particular combination resulting in the 
undesired event. 

In order for the Matrix Gate to meet all possible situations it is 
necessary for two types of gate to exist; the variable type Matrix 
Gate and the conditional type Matrix Gate. The variable type gate 
handles situations where both of the inputs to the gate consist of 
fault events (fault events being referred to as variables). The 
conditional type gate handles situations where one input consists 
of fault events (variable) and the other input consists of conditional
events.

Example 1 (Figure C3) is a generalized case using the variable 
type Matrix Gate. Fault events A1, A2, A3 and A4 are unique but
similar and fault events B1, B2, B3 and B4 are unique but similar.

The Boolean Expression derived from the sample fault tree agrees 
with the Boolean expression extracted from the Matrix Gate and its 
associated matrix. 

2.4.2.1 Variable Type Matrix Gate

Example 2 (Figure C4) is a typical problem in which a four-wire
cable is to be analyzed. The wires are identified as A1, A2, A3 and
B. Under standby operating conditions, assume that none of these
wires carry voltage, and furthermore, that wire B is an ordnance
line and wires A1, A2 and A3 carry voltage at certain discrete time
intervals. The undesired event is wire A1, A2, or A3 shorting to
wire B and at the same time having voltage on it from a fault
condition at the voltage source.

In this example the events which cause wire A1 to short to wire B
will be similar to the events which cause wire A2 to short to wire B
and wire A3 to short to wire B. For example, they could be shorts
caused by an insulation failure or a primary wire failure. Therefore,
the fault conditions of these three wires are unique, yet
similar. Since they are similar, they are drawn only once with the
Matrix Gate, instead of three times under conventional techniques.

The fault events which allow power onto wire A1 may or may not be 
similar to the events which allow power onto wire A2 or A3, depending 
upon the circuitry involved. If the fault events are similar (or 
the same) the Matrix Gate can be utilized easily, with the fault 
event drawn only once. However, if the fault events are completely
different for each wire, the Matrix Gate becomes more complex, and
each distinct fault event must be drawn (with little saving over
conventional techniques). Since the circuitry at the voltage source
is not developed in this example, an assumption will be made that
the faults are similar for each wire.

The 3x3 matrix drawn in Example 2 points out the combinations of
interest in this particular analysis. The boxes which contain a
"one" are the combinations of concern. These boxes, figuratively
speaking, say that "the faults allowing power on wire A1" are ANDED
with "the faults causing wire A1 to short to wire B", and "the faults
allowing power on wire A2" are ANDED with "the faults causing wire
A2 to short to wire B", and "the faults allowing power on wire A3"
are ANDED with "the faults causing wire A3 to short to wire B" which
are all summed together by an OR gate.
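
The expansion of a Matrix Gate into two-input AND gates summed by an OR gate can be sketched as below. The function name and the list-of-lists matrix layout are illustrative assumptions; the diagonal matrix follows the description of Example 2, in which only "power on wire Ai" combined with "wire Ai shorts to wire B" is of concern.

```python
def matrix_gate(a_events, b_events, matrix):
    """OR together (A_i AND B_j) for every box of the matrix marked 1."""
    return any(
        matrix[i][j] and a_events[i] and b_events[j]
        for i in range(len(a_events))
        for j in range(len(b_events))
    )


# Rows: faults allowing power on wire A1, A2, A3.
# Columns: faults causing wire A1, A2, A3 to short to wire B.
EXAMPLE_2_MATRIX = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

if __name__ == "__main__":
    power_on = [True, False, False]
    shorted = [False, True, False]
    print(matrix_gate(power_on, shorted, EXAMPLE_2_MATRIX))
    print(matrix_gate(power_on, [True, False, False], EXAMPLE_2_MATRIX))
```

For a 50 wire cable the same function applies unchanged with a 50x50 diagonal matrix, which mirrors the saving in drawing effort that the Matrix Gate is intended to give.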

The significance of using a Matrix Gate in Example 2 may not be 
readily apparent, but suppose the four-wire cable had been a 50 wire 
cable. Instead of drawing 50 iterations of wire shorts combined with 
faults allowing power on the wire, the Matrix Gate requires only one
iteration of the combination. The tediousness of drawing and reading 
superfluous information has been eliminated, yet the necessary 
information is not lost. 

2.4.2.2 Condition Type Matrix Gate

Example 2 demonstrated the Matrix Gate with both of the inputs as
variables. That is, both of the inputs to the gate consisted of
fault conditions. A second, and slightly different, way of using
the Matrix Gate is with one input as a variable and the other input
as a condition. This type of usage is fitted for situations whereby
the Matrix Gate is employed to replace Inhibit Gates which have
similar or redundant inputs. Example 3 depicts this type of usage.

Example 3 (Figure C5) deals with a car and highway situation. In
this example a car is analyzed for the undesired event "car wreck"
and the only failure modes being considered are: 1) blowout,
2) loss of steering, and 3) brakes locking. In addition to analyzing
the car to determine the causes of these failure modes, certain road
conditions are placed on each failure mode. These conditions are:
1) the road being wet, 2) the road being dry, and 3) the road being
icy.

As is apparent from the fault tree shown in Example 3, the variable 
inputs to the Inhibit Gates are redundant, and result in a unique set 
of combinations. This unique set of combinations results in a long 
and repetitious fault tree, which can be effectively reduced in size 
and complexity as shown. 

The 3x3 matrix shown in Example 3 demonstrates that nine unique,
but related combinations result from this particular example. Furthermore,
it shows which fault event is combined with which conditional
event, and the number of times each event is combined.

2.4.2.3 The Matrix

Now that the Matrix Gate has been exemplified in a simple and concise 
manner, a small adjustment factor must be introduced. This adjustment 
factor involves the "one" and "zero" placed inside the boxes of the 
matrices. These numbers are in actuality probability numbers which 
represent the probability of an Inhibit Gate allowing each combination
(of fault events) to result in the undesired event. To be specific, 
an Inhibit Gate is located between each AND gate combination and the 
summing OR Gate. This "hidden" Inhibit Gate does not appear in the 
fault trees of Examples 1 , 2, and 3 because the probability of a 
particular combination resulting in the undesired event has been 
assumed as one or zero. When the probability was zero for a certain 
combination this meant that the combination was either impossible or 
not desired for analysis. When the probability was one for a certain 
combination this meant that when the two events occurred, the undesired 
event was immediately realized. The probability of the combination 
resulting in the end event is not always one or zero, but frequently 
some value in-between. 

Example 4 (Figure C6) is a continuation of Example 3, except the 
"hidden" Inhibit Gate is shown in the diagram. This example demon- 
strates the probability involved for realizing a car wreck given that 
a car fault occurs and the appropriate road condition is fulfilled. 

Take for example the fault tree path "blowout on a wet road". When a 
blowout occurs and the road is wet it does not necessarily follow that 
there will be a car wreck. There is a certain probability involved 
for a blowout on a wet road to result in a wreck, and this probability 
is represented by an Inhibit Gate condition. The probability of this 
condition is placed inside the matrix which accompanies the Matrix
Gate.


The probability numbers in the matrix should not be taken as the
probability of two fault events being combined together. These 
numbers indicate the probability that two combined fault events 
will result in the undesired event after they have statistically 
been combined. Example 5 (Figure C7) shows the generalized case 
and the mathematical equations involved. 
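
As a rough numerical illustration of how the matrix entries enter the calculation, the sketch below sums, over every fault/condition pair, the product of the fault probability, the condition probability, and the matrix entry. The event names other than "blowout" and "wet road", all probability values, and the treatment of the summing OR gate as a simple addition (reasonable only for small probabilities and independent events) are assumptions made for illustration; Figure C7 gives the handbook's own generalized equations.

```python
# Hypothetical car-wreck example in the spirit of Examples 3 and 4.
# All probabilities below are invented for illustration only.

fault_prob = {"blowout": 1e-3, "brake failure": 5e-4, "steering failure": 2e-4}
condition_prob = {"dry road": 0.7, "wet road": 0.25, "icy road": 0.05}

# Matrix entry m[fault][condition]: probability that the combination,
# once it has occurred, actually results in the undesired event
# (the "hidden" Inhibit Gate condition).
matrix = {
    "blowout":          {"dry road": 0.1, "wet road": 0.3, "icy road": 0.6},
    "brake failure":    {"dry road": 0.2, "wet road": 0.4, "icy road": 0.8},
    "steering failure": {"dry road": 0.5, "wet road": 0.7, "icy road": 1.0},
}


def matrix_gate_probability(faults, conditions, m):
    """Sum P(fault) * P(condition) * m[fault][condition] over all pairs.

    Assumes independent events and small probabilities, so the summing
    OR gate is approximated by simple addition.
    """
    total = 0.0
    for fault, p_fault in faults.items():
        for cond, p_cond in conditions.items():
            total += p_fault * p_cond * m[fault][cond]
    return total


p_wreck = matrix_gate_probability(fault_prob, condition_prob, matrix)
print(f"Approximate probability of the undesired event: {p_wreck:.3e}")
```

Setting every matrix entry to one or zero reproduces the simpler situation of Examples 1 through 3, where each combination either always or never leads to the undesired event.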


2.4.2.4 Conclusion


The preceding discussion provides evidence that the Matrix Gate 
and its associated matrix successfully represent a condition of 
similar or redundant fault event combinations in a simple and 
concise form while at the same time yielding all of the qualitative 
information involved. 
[Figure C7: Generalized Matrix Gate - Mathematical Equations]

2.4.3 Transfer Symbols

The "transfer" symbol is used to allow continuity between two 
parts of a fault tree. A line drawn into the side of a triangle 
transfers everything below that triangle to another location, 
which is identified by a triangle with a line drawn from the apex 
and containing matching nomenclature and identifying symbol. The 
methodology is illustrated below: 


[Illustration: two transfer triangles, each labeled with the nomenclature "relay XK 12 fails closed" and a matching identifying symbol]


Two types of transfer symbols exist. The "internal" (local) transfer 
symbol transfers portions of a fault tree only within a particular 
diagram. The idea is that whenever the development of a certain
portion of a fault tree is identical in two or more places on the
same diagram, it need only be developed in one place.

The "external" (global) transfer symbol transfers a portion of a 
fault tree to another, entirely separate, fault tree diagram. This 
happens when a development is identical for one event on two separate 
diagrams. Also, when a diagram is developed until there is no longer 
room for further expansion on the sheet (or it is desired to end at 
a particular place) an external transfer is used to continue develop- 
ment on another sheet. This is the method by which new fault tree 
developments (sub-diagrams) are started. 

Figure C8 is an example of transfer symbol usage. It shows the
correct use of both internal and external transfers. 
2.4.4 Output Encompassing Ellipse

An ellipse with a line extending out along the major axis is used
when a component appears several times at the same place (e.g., a
10-stage counter where all 10 stages can be represented by illustrating
one stage). Only one of the inputs is drawn to encompass
the output. This indicates that the failure rate of that event is
to be multiplied by the given factor (times 10 for the 10-stage
counter) for an "OR" gate, or raised to a given power and multiplied
by the expression (n τ^(n-1)) for an "AND" gate. This symbol is
illustrated below.



2.5 EVENT IDENTIFICATION 

All events comprising a fault tree must be identified by a code.
This is necessary for four reasons: 1) easy and precise referencing,
2) for purposes of machine drafting, 3) in order for a log of events
to be maintained, and 4) for purposes of quantitative evaluation.

The means by which events are identified is generally dependent upon 
the requirements and objectives of the particular analysis. A 
standardized procedure should be set up and adhered to for an entire 
analysis program. 

The size and complexity of aerospace systems have demanded that a
unique method of event identification be utilized. A method has been 
developed to satisfy the requirements and objectives of the Apollo 
system fault tree analysis, plus allowance for future expansion or 
quantitative evaluation. 

All events are classified into one of two categories. These two
categories are referred to as "global" events and "local" events.
Global events are defined as events which are used on more than one
fault tree diagram, and local events are defined as events which are
unique to one fault tree diagram. The notation (or code) for events
allows each event to be uniquely represented, at the same time
differentiating between global and local events. The standardized
notation is shown in Figure C9.



From Figure C9, it can be readily discerned that the alpha
character identifies the type of event. That is, "W" indicates
a house, "X" indicates a circle, "Z" indicates a diamond, and
"I" indicates an oval. Local events are numbered from 01 through
99 for each and every diagram. For example, diamonds on the AAA
diagram are randomly numbered as Z01, Z02, Z03, etc., and diamonds
on the RAA diagram are also numbered as Z01, Z02, Z03, etc. The
only way to differentiate between local events is by indicating
the fault tree diagram on which they are located. Global events
are numbered from 100 through 999 and an index must be used to
locate diagrams on which these events appear.

For the identification of global transfers (sub-diagrams) a three 
character alpha system is utilized. Using three alpha characters 
allows identifying nomenclature for a possibility of 17,576
diagrams. In conjunction with this method, a breakdown can be
established which immediately identifies the source of each diagram.
This breakdown consists of delegating the first letter, of all 
three letter combinations, to a particular MSF Center, contractor, 
or analyst. 

As shown, local transfer symbols are numbered from 01 through 99 
for each fault tree diagram. When referring to a particular local 
transfer, the diagram on which it appears must also be given.
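
The numbering rules above can be captured in a few lines. The sketch below classifies an event code as local or global from its number and confirms the count of three-letter diagram names. The exact code layout is defined in Figure C9, which is not reproduced here, so the "one letter followed by digits" format and the function name `classify_event` are assumptions for illustration.

```python
import re
import string

# A subset of the event-type letters named in the text for Figure C9.
EVENT_SHAPES = {"W": "house", "X": "circle", "Z": "diamond"}


def classify_event(code):
    """Split a code such as 'Z03' or 'X125' into (shape, number, scope).

    Local events are numbered 01 through 99; global events 100 through 999.
    The 'letter + digits' layout is an assumed simplification.
    """
    match = re.fullmatch(r"([A-Z])(\d{2,3})", code)
    if match is None:
        raise ValueError(f"unrecognized event code: {code!r}")
    letter, digits = match.group(1), match.group(2)
    number = int(digits)
    if len(digits) == 2 and 1 <= number <= 99:
        scope = "local"
    elif len(digits) == 3 and 100 <= number <= 999:
        scope = "global"
    else:
        raise ValueError(f"event number out of range: {code!r}")
    return EVENT_SHAPES.get(letter, "unknown"), number, scope


# Three alpha characters give 26 ** 3 possible diagram names.
diagram_names = len(string.ascii_uppercase) ** 3

print(classify_event("Z03"))
print(classify_event("X125"))
print(diagram_names)
```

Because local numbers repeat on every diagram, a complete reference to a local event would pair the code with its diagram name (for example, diagram AAA together with Z03).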

2.6 BASIC DIAGRAM METHODOLOGY 

The development of a fault tree diagram commences with the 
definition or identification of the top "undesired event" to be 
analyzed. The top undesired event can be an encompassing event,
such as "mission loss", indicating a complete system analysis; it
could be a limiting event, such as "crash due to engine failure";
or it could be a specific event, such as "amplifier fails resulting
in low output", indicating analysis beginning at a hardware level.
Once definition of the undesired event has been accomplished, the 
system is analyzed using the following rules and definitions of 
fault tree diagramming to determine and model the inter-relation- 
ships and combinations of both normal and abnormal system functions 
which could cause the occurrence of the top undesired event. 

The next step is to divide the system operating modes into phases.
A phase is that increment of a system's life which can be analyzed
independently, yet recognizing that there may be commonality of
analysis between any of the phases. System phase breakdown should
continue (corresponds to system engineering functional analysis)
until the environment stays relatively constant through the phase
element and system operational characteristics do not change the
fault environment. The development of a fault tree proceeds through
the identification and combination of the system events (normal
and fault) until all fault events are definable in terms of basic
identifiable hardware faults, to which failure rate data can be
applied.

Figure C10 shows the general relationship of fault tree
segments. Although shown as distinct elements, it should 
be noted that the segments will, to a certain extent, "mix" 
together throughout the fault tree structure. 


[Figure C10: Fault Tree. Segments, from top to bottom: Undesired Event(s); System Phases; Identification of Cause Sources (fault flow); Primary, Secondary & Command Paths; Primary Event Identification. Structure levels: Major System Flow; Sub-System Flow; Detailed Hardware Flow.]

Developing the "fault flow," or cause and effect relationship 
of events through a system, requires deductive reasoning at 
each "gate event" or level of the fault tree. This deductive 
reasoning basically involves the answering of five questions: 
1) necessity, 2) sufficiency, 3) primary, 4) secondary, and
5) command. These questions effectively develop the structure of 
the fault tree on a progressive, or level-by-level, basis. 

To answer the questions "necessity" and "sufficiency" requires 
an evaluation of the system for normal and abnormal functional 
event relationships. This evaluation determines the system unique 
events, and logic gates combining them, to result in the undesired 
event. This is accomplished by looking at the undesired event and
asking, "What is necessary and sufficient to cause this undesired
event?" For example, an ordnance device will be activated when
two events occur: 1) the ordnance device Safe and Arm mechanism
closes, "AND" 2) energizing power is available on the ordnance
device ignition line. These two events are all that is "necessary"
and "sufficient" to cause activation of the ordnance device.

The questions "primary" and "secondary" are questions requiring 
an evaluation of the system to determine what primary and/or 
secondary fault events can occur to result in another fault event. 

A concise definition of "primary" and "secondary" failures:

Primary Failure: Failure initiated by failures within, and of,
the component under consideration, e.g., resulting from poor
quality control during manufacture, etc.; applied only to the
component during Fault Tree Analysis when a generic failure
rate is available.

Secondary Failure: Failure initiated by out-of-tolerance operational
or environmental conditions, i.e., a component failure can be
initiated by a failure not originating within the component.

These questions also help to identify the specific failure modes 
of the fault event. For example, a primary failure mode of an 
ordnance device would be the mode of auto-ignition. A secondary
failure mode would be that of ignition due to excessive external 
shock or heat. 

The question "command" is really a guideline for development 
through the system. The question asks, "What upstream event will 
command the downstream event to occur?" The upstream event may 
be a primary and/or secondary event, or it may be an event
commanded by an event further upstream. 

A concise definition of "command" failures:

Command Failure:* The component was commanded/instructed to fail,
i.e., resulting from proper operation at the wrong time or place.

Essentially, the "command path" is a chain of events delineating 
the failure path of command events through a system. The command 
path ultimately ends (at the finish of the analysis) in a
primary and/or secondary fault event. Take for example, a set of
relay contacts failing closed, as part of a system function. The
contacts may fail closed as a primary failure, they may fail closed
from a secondary cause such as foreign material bridging the contacts,
or they may be commanded to close by a relay coil failure.
If an upstream event causes the relay coil to be energized, the
contacts are effectively "commanded" to close as a result of this
upstream event.

* Component may not always have command failure mode 
(e.g. a standard bolt) in which case this mode may 
be disregarded. 

The effective inter-relationship of the five necessary deductive 
questions is shown below: 


[Diagram: the fault tree is built, through necessity and sufficiency, from primary events, secondary events, and command events]


As indicated, a fault tree is constructed of primary events, 
secondary events and command events through the medium of necessity 
and sufficiency. 

In developing a fault tree certain thought processes take place 
in the mind of the analyst. The steps of development at each 
level of the fault tree delineating these thought processes are: 

1) Define the undesired output event; 

2) Determine what is "necessary and sufficient" to produce 
the undesired output; 

3) List all primary events related to the undesired output; 

4) List all secondary events related to the undesired output; 

5) Define the undesired input event which could command the 
output event; 

6) Repeat steps 1-5 for the new undesired event defined in 
step 5. 

Figure C11 shows the relationship of the above steps to the 
structure of a fault tree. The inherent simplicity and logical 
process is readily apparent from this example. 
[Figure C11: Fault Tree Relationships]

Figure C12 shows a logic diagram structure which portrays the
relationship of the command event to the primary and secondary
events, and also how command events lead to a "command path."

It must be remembered that the command path, as such, is only
a guideline for analysis of event development through a system. 
Command events create an orderly and logical manner of analysis 
at each level of the fault tree. Once an analysis is completed, 
comparison between the fault tree and signal flow diagram will 
show that the fault tree "command path" of a branch will represent 
the steps of signal flow along a single thread. 
[Figure C12: Example of Command Path]

2.7 THE HUMAN ELEMENT

Any system which requires the human element in order to perform 
its intended function must have an analytical development that 
includes the human as part of the system. The human element is 
a complex subsystem, and human cause and effect relationships 
must be an integral part of the system's fault tree structure. 

An example of how the human element can be portrayed in a fault 
tree is shown in Figure C13. The top event defines any arbitrary
human operation and is used merely to illustrate the development
below the event. The circle shown as "Crew Member Fails to Perform
Function" (the identified critical function) represents the
possibility of inadvertent error, usually highly improbable. The
other two inputs to the top "OR" gate represent the "command"
(no input information) development and the "secondary" cause
development. Either of these two branches will most likely contain
the dominant factors associated with failure of a crew member to
perform a critical function. The events shown in this fault tree,
Figure C13, are examples of the types of causes which could result
in no action taken by a crew member. There are others which for 
simplicity are not shown in this illustration (indicated by dotted 
lines). 

2.8 DOMINANT PATHS 

A dominant path is the chain of events which is most "likely" to 
result in the undesired event (potential accident). In a typical 
case, there may be several paths of various degrees of dominance 
which can result in a given undesired event. These chains and 
their associated degrees of dominance are most clearly identified 
by the system safety model (fault tree or logic diagram). Dominant 
paths and their relative degrees of dominance are determined by 
event weighting (inspection) or rigorous mathematical solution of 
the model. 

Since the dominant path is the most likely avenue along which the 
undesired event(s) can occur, the most cost effective approach is
to concentrate the initial prevention effort in this area. It may
be necessary to consider other paths within the model, in a
descending order of dominance, in order to achieve an acceptable 
level of risk for the occurrence of a particular undesired event. 

Preparing to locate dominant paths requires that the system safety 
model for a given undesired event (potential accident) has been 
developed to the extent necessary to identify dominant paths. As 
a minimum, the fault tree development, which is the model, must 
encompass all those safety features and devices which have been
designed into the system. This assures that adequate consideration
has been given to those areas of the system which are of the greatest 
"risk," since safety devices are normally placed where the greatest 
risk of an undesired event occurring exists. 

Logical inspection or mathematical processes determine the degree 
of dominance for those paths of the model which contribute the 
most to the likelihood of the undesired event. The term "logical 
inspection" is defined to mean the logical thought processes of a 
trained and experienced analyst being applied through examination 
of the model. These processes, associated with weighting factors 
he may consider, lead to the resulting statement by the analyst 
that "these events (identified) and path(s) appear to be the most 
probable." 

The term "mathematical process" can be a solution of the model by
any of several methods. Normally, a diagram with 250 events or
fewer is solved by the Lambda-Tau (hand calculated) method, and a
diagram with more than 250 events is solved on a digital computer
using Monte Carlo simulation with importance sampling. An event in
this case is defined to be any element of the diagram other than a
logic gate. Since the purpose of the quantitative evaluation of a
diagram is to identify dominant paths and their relative significance,
the diagram is usually simplified by inspection to minimize
the structure to be simulated. This inspection is the elimination
of those events and branches which are obviously insignificant
compared to others which are inputs to the same gate.

Control of dominant paths is accomplished by the following: 

1) Establish a predetermined limit within which the initial path
selection is bounded. This involves the identification of
those paths which are computed to be above any established
limit for the system.

If the paths are near or below the limit, then they are
selected by picking those which are within an "order of
magnitude" or so of the limit, or are of the same type.

2) The initial selection must be divided into groups for which 
a set of predetermined limits has been established for each 
grouping. The grouping of paths is accomplished by selecting 
those within an order of magnitude of each other or those 
which have an apparent commonality within the system. 

3) Determine if a common point of departure exists among the 
paths of each group. This evaluation involves determining 
if there are common faults among the paths. Recommended
changes to the system at these common points provide the most
effective way to eliminate paths, or at least reduce them to
an acceptable level.

4) Convert the fault tree dominant paths by grouping events at
logical summary points. Conversion of the fault tree dominant
paths involves making a listing of those events which, when
"OR"-ed, result in an interim event. The method is to convert
each path to a simplified alternating "AND," "OR," "AND," "OR,"
etc., relationship.

5) Simplify the fault tree of the dominant path by logically 
re-diagramming. Simplification involves re-diagramming the 
relationships summarized in step 4. This results in a
simplified diagram of each path which can be readily correlated 
with a functional flow diagram of the system. The paths can 
now be verified as to accuracy and the actual fault points 
introduced into a functional flow diagram to show where and 
how the fault combinations affect system operation. 

6) Determine those events for which a design change or the 
development of a procedure will best and most cost effectively 
reduce the probability of occurrence of an undesired event to 
an acceptable level of risk. 

7) Insert alternative solutions as derived by steps 1 through 6 
and repeat the process until an acceptable level of risk is 
obtained. This step involves working with designers and 
selecting several alternative system changes to reduce the 
probability of occurrence of each path. For each alternative 
to be evaluated, the fault tree is changed to reflect the 
change and the diagram is recomputed to determine the change 
impact. Care must be exercised to assure that other paths or 
branches of the tree which have the same event or fault 
sequence are also changed to reflect the change being evaluated. 

8) Advise appropriate level of management of findings and 
recommendations. 
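
A minimal sketch of the first two control steps (initial selection against a limit, then grouping by order of magnitude) is shown below. The path names, probabilities, and the limit are hypothetical, and grouping by the integer part of the base-10 logarithm is only one simple way to read "within an order of magnitude."

```python
import math

# Hypothetical path probabilities from a solved fault tree model.
paths = {
    "path A": 3.0e-4,
    "path B": 1.2e-4,
    "path C": 4.0e-6,
    "path D": 2.5e-6,
    "path E": 7.0e-9,
}
LIMIT = 1.0e-5  # assumed acceptable-risk limit for a single path


def select_initial_paths(path_probs, limit):
    """Step 1: keep paths above the limit, or within a factor of 10 below it."""
    return {name: p for name, p in path_probs.items() if p >= limit / 10.0}


def group_by_magnitude(path_probs):
    """Step 2: group paths whose probabilities share the same power of ten."""
    groups = {}
    for name, p in sorted(path_probs.items(), key=lambda item: -item[1]):
        decade = math.floor(math.log10(p))
        groups.setdefault(decade, []).append(name)
    return groups


selected = select_initial_paths(paths, LIMIT)
for decade, names in group_by_magnitude(selected).items():
    print(f"1e{decade}: {names}")
```

The remaining steps (finding common fault points, re-diagramming, and evaluating design alternatives) depend on engineering judgment and on the structure of the particular tree, so they are not sketched here.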
2.9 FAULT TREE EVALUATION

2.9.1 Failure Data Development for Fault Tree Evaluation 

Failure data is developed as a tool to define the effects of 
various component failure modes and classify these effects on 
system equipment or personnel. The format in Figure C14 is pro- 
vided for assistance and guidance in developing system safety 
failure data. This format can be changed according to various 
requirements and should be considered as an example only. 

The various columns are explained as follows: 

COLUMN I - COMPONENT 

Components are defined, at the discretion of the analyst, by 
their physical or functional significance. The following guide 
will facilitate understanding of the types of natural separations 
to consider. It is not intended to be exhaustive. 

1) Electronic Logic Circuits

Many systems or subsystems are made up of a number of basic 
circuit designs which perform an identifiable purpose. These 
are used as building blocks for larger circuits designed to 
perform the required logic functions of the system or subsystem. 
To minimize the analysis required, the basic circuits can be 
defined as major components, and an analysis made of each logic 
function. 

2) Mechanical Devices 

Mechanical devices can be either a single part or an assembly 
of parts which perform one function. The use in the system 
will dictate to what level of detail mechanical parts should 
be considered. Single parts which can be considered major
components are: solid driveshafts, engine blocks, primary
structure, etc. The majority of mechanical devices will be
assemblies of many parts and it is more reasonable to treat
the assemblies as major components, for example: relays, pumps,
motors, mechanical safety devices, etc. This permits the
majority of vendor-supplied mechanical devices to be analyzed
as major components.

3) Electrical Systems 

Major components can be basic components of a circuit or combinations
of components used to perform one single function such
as amplifiers, rectifiers, or regulators. The level of data
development should be based on the importance of the part as a
functional element in the design.



4) Chemical Systems 

In systems containing chemical compounds, the chemicals 
should be considered as major components if these compounds 
can cause failures of other components through chemical 
reaction or release of chemical energy. Examples of chemical 
components are: fuels, pressurants, coolants, and preservatives. 

5) Safety Devices 

Safety devices will normally be considered major components 
since they are used primarily to protect against undesired 
events. 

6) Wiring 

Interconnecting wiring of major components will be considered 
a major component. Internal wiring will be considered as a 
part of a major component. Physical characteristics of cables 
which circumvent failures between wires should be stated in 
the cable analysis. 

COLUMN II - COMPONENT FAILURE MODE 

Failures of major components consisting of one part require a 
listing of the modes in which that part may fail. Failures of 
major components consisting of more than one part will require a 
failure mode and effects analysis to determine how the failure 
modes of each part affect the component's output. These part
failure effects will be the failure modes of the major component 
listed in the system safety failure data. All failure modes of 
the component should be listed. 

COLUMN III - COMPONENT FAILURE RATE 

The predicted reliability or the failure rate computed from actual
field data of primary failures should be tabulated in this column
for each major component in each of its modes of failure. This
data can be used in evaluating the probability of the fault event
or in selecting which critical or catastrophic events should be
analyzed if the decision is made not to analyze an event so classified.
It also serves as a data bank for future reference when the
need arises to analyze other undesired events as a result of 
system changes. 

COLUMN IV - SOURCE OF DATA 

This column states the source of the failure rate data.
It shows the differentiation between field data, test
data, calculated data, etc.

COLUMN V - FAILURE STATE 

Many major components are recurrently activated during the 
system’s operational life. The level of stress on these 
components will change from one system mode to another. The 
effect of a failure in each mode can be different; for example,
components supplied with power only during a test can create
a fault hazard only while a test is performed. Failures existing
in one mode of system operations can also adversely affect the
system when the mode is changed. This column therefore should
reflect the environmental state of the component when it failed.

COLUMN VI - EFFECT OF COMPONENT FAILURE 

This column states the effect on related system equipment and/or
personnel due to the component failure. 

COLUMN VII - REMARKS 

This column may be used to include additional information needed
to clarify or verify information in other columns as well as 
other information currently pertinent to system safety efforts. 
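
One way to hold the seven columns in software is a simple record type. The sketch below assumes Python dataclasses; the field names, the failure rate unit, and the sample relay entry are hypothetical and only mirror the column headings described above.

```python
from dataclasses import dataclass


@dataclass
class FailureDataRecord:
    """One row of a system safety failure data sheet (Columns I through VII)."""
    component: str            # Column I
    failure_mode: str         # Column II
    failure_rate: float       # Column III, failures per hour (assumed unit)
    data_source: str          # Column IV: field, test, calculated, etc.
    failure_state: str        # Column V: system mode when the failure occurs
    effect: str               # Column VI
    remarks: str = ""         # Column VII


# Hypothetical entry; the values are invented for illustration.
record = FailureDataRecord(
    component="relay XK 12",
    failure_mode="fails closed",
    failure_rate=1.0e-6,
    data_source="field data",
    failure_state="powered during test",
    effect="inadvertent power applied to downstream circuit",
)
print(record)
```

A component with several failure modes would simply contribute one record per mode, which matches the instruction that all failure modes of the component be listed.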
2.9.2 Fault Tree Quantitative Evaluation

After the fault tree has been constructed and input data acquired,
the tree can be evaluated. The object is to establish the likelihood
of occurrence of the "undesired event" and to evaluate the
relative contribution of each indicated failure mode. With this
information the safety analyst can identify the dominant system
failure modes (dominant paths) and management can make the decision
as to whether or not corrective action is warranted.

Two basic approaches used to quantify fault trees are 1) calculation,
and 2) simulation. The calculation or deterministic approach
will be considered first. For fault trees where every basic input
is non-repairable, classical probability can be used. In this case,
each gate merely represents the operation to be performed (i.e.,
union for "OR" gates and intersection for "AND" gates). The classical
probability approach, while simple and efficient, is not
adequate for fault trees where the effects of a basic failure can
be eliminated before causing the undesired event. A basic failure
whose effect can be removed is called repairable; however, the
usage of the word "repairable" is irregular because the effect may
be terminated without actually repairing or replacing the failed
item. A more definitive term is "fault duration time." The analysis
of repairable systems requires special statistical techniques.
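
For the non-repairable case, the gate operations reduce to elementary probability. The sketch below assumes statistically independent inputs; the function names, the small tree, and the probabilities are illustrative only.

```python
from functools import reduce


def and_gate(probabilities):
    """Intersection of independent events: product of the probabilities."""
    return reduce(lambda acc, p: acc * p, probabilities, 1.0)


def or_gate(probabilities):
    """Union of independent events: 1 minus the product of the complements."""
    return 1.0 - reduce(lambda acc, p: acc * (1.0 - p), probabilities, 1.0)


# Hypothetical tree: TOP = (A AND B) OR C
p_a, p_b, p_c = 0.01, 0.02, 0.001
p_top = or_gate([and_gate([p_a, p_b]), p_c])
print(f"P(top event) = {p_top:.6e}")
```

This direct gate-by-gate evaluation is valid only when no basic event appears in more than one branch; repeated events must first be removed by Boolean reduction.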

2.9.2.1 Computation

One technique in the calculation or deterministic approach is the
"Lambda-Tau" method to evaluate fault trees. In this method,
failure rates must be small, fault duration times must be small
with regard to mission length, and redundant inputs must be removed.
Redundancies that are not removed may lead to serious unbounded
errors in the answer. The fault tree diagrams are usually
expressed algebraically and operated on by theorems of Boolean
algebra to remove redundancies. The "Lambda-Tau" method can be
applied by hand or by digital computer. However, as the fault
trees get larger in size, the task of hand calculation becomes
time consuming, laborious and error prone. A computer program can
write the algebraic expression and can use Boolean algebra to
remove the redundancies. However, computer core storage on most
computers limits the size of the tree solvable by this method.
Nevertheless, smaller fault trees can be calculated accurately
by hand or computer using "Lambda-Tau" methods. (See Section 2.9.3
for further details.)
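
To illustrate what removing redundancies means in practice, the sketch below expands a small tree into cut sets (sets of basic events joined by "AND") and applies the Boolean identities A·A = A and A + A·B = A. The tree and event names are hypothetical, and this is a generic cut-set reduction, not the specific computer program the text refers to.

```python
from itertools import product


def cut_sets(node):
    """Return the cut sets of a tree as a set of frozensets of basic events.

    A node is either a basic-event name (str) or a tuple
    ("AND" | "OR", [child, child, ...]).
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        return set().union(*child_sets)
    if gate == "AND":
        # Union one cut set from each child; frozenset applies A.A = A.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimize(sets):
    """Absorption law A + A.B = A: drop any cut set containing a smaller one."""
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical tree in which event "A" appears in two branches.
tree = ("AND", [("OR", ["A", "B"]), ("OR", ["A", "C"])])
minimal = minimize(cut_sets(tree))
print(sorted(sorted(s) for s in minimal))
```

Evaluating the unreduced tree gate by gate would treat the two appearances of "A" as separate events, which is the kind of error the text warns about.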
2.9.2.2 Simulation

In the simulation approach, a fault tree is represented on 
a computer and failures are simulated over a given mission 
length. The computer prints out the failure which leads to 
the undesired event, and the probability is calculated. The 
simulation approach has all the advantages of the calculation 
approach except for the greater amount of computer time needed 
to simulate fault trees with small probabilities. Simulation 
offers several advantages: namely, the dominant paths are
listed and the computer can solve larger diagrams (10 times
larger than "Lambda-Tau"). Simulation has gone through many
stages of development. In its early stages, the amount of
computer time required became prohibitive; however, special
Monte Carlo variance reducing techniques (importance sampling)
have greatly reduced the computer time required. The importance
sampling technique distorts the true failure distribution to
make events occur more rapidly. Thus, the number of trials (a
trial represents the predefined mission length of the system)
required for an acceptable statistical confidence is reduced.
With fewer trials required, computer time is reduced. The
distortion of the distribution, when using importance sampling,
is compensated for by calculated weight factors. See Nagel, P.M.,
and Schroder, R.J., "The Efficient Simulation of Rare Events in
Complex Systems," D2-114072-1, The Boeing Company. Overall,
simulation offers more potential and has proven to be more effective
in calculating accurate answers than the "Lambda-Tau" calculation
method.
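
The idea of distorting the failure distribution and compensating with weight factors can be shown on a very small case. The sketch below is a generic importance-sampling estimate for a two-input "AND" gate, not the method of the cited Boeing report; the probabilities, the biased sampling probabilities, the trial count, and the function name are all assumptions for illustration.

```python
import random


def simulate_and_gate(p_true, p_biased, trials, seed=0):
    """Estimate P(all events occur) by Monte Carlo with importance sampling.

    Each event is sampled with the inflated probability in p_biased, and every
    trial carries a weight (likelihood ratio) that corrects for the distortion.
    """
    rng = random.Random(seed)
    total_weight = 0.0
    for _ in range(trials):
        weight = 1.0
        all_failed = True
        for p, q in zip(p_true, p_biased):
            if rng.random() < q:
                weight *= p / q              # event occurred in this trial
            else:
                weight *= (1.0 - p) / (1.0 - q)
                all_failed = False
        if all_failed:
            total_weight += weight
    return total_weight / trials


# Hypothetical mission failure probabilities for two redundant items.
p_true = [0.01, 0.02]
estimate = simulate_and_gate(p_true, p_biased=[0.5, 0.5], trials=20000)
exact = p_true[0] * p_true[1]
print(f"estimate = {estimate:.3e}, exact = {exact:.3e}")
```

With unbiased sampling, only about 4 of the 20,000 trials would be expected to reach the top event; with the biased probabilities roughly a quarter of them do, which is why far fewer trials are needed for the same confidence.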
2.9.3 Constant Repair or "Lambda-Tau" Method of Fault Tree Evaluation


2.9.3.1 Coexistence of Independent Failures

Suppose there is given a group of n repairable items, and these
items may or may not fail in a given time period, T. Let event
$A_1$ represent the failure of item 1, event $A_2$ the failure of
item 2, and in general event $A_i$ the failure of item i, i = 1, 2,
..., n. These failures are chance failures, occurring at random
and independent of each other. It is these chance failures
which have an exponential distribution of their time to failure.
Hence the probability that an item in that group will not fail
may be expressed as the reliability,

$$R_i(t) = e^{-\lambda_i t_i} \qquad (1)$$

where $t_i$ is the given time period, and $\lambda_i$ is the number of
failures per unit time. The unreliability or chance of failure
is

$$Q_i(t) = 1 - R_i(t) = 1 - e^{-\lambda_i t_i} \qquad (2)$$

This unreliability may also be called the probability that
item i will fail during time $t_i$, and is the probability that
event $A_i$ will happen. For each item i, assume that the failure
rate $\lambda_i$ and repair time $\tau_i$ are constant. Further assume
that $\tau_i/T$, $\lambda_i$, and [term illegible in source] are small.
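
A short numerical check of the exponential reliability expression is sketched below. The failure rate and time period are hypothetical values chosen so that their product is small, which is the regime assumed by this method.

```python
import math


def reliability(failure_rate, time):
    """Equation (1): probability of no failure in the given time."""
    return math.exp(-failure_rate * time)


def unreliability(failure_rate, time):
    """Probability of at least one failure in the given time."""
    return 1.0 - reliability(failure_rate, time)


# Hypothetical item: 1e-5 failures per hour over a 100-hour period.
lam, t = 1.0e-5, 100.0
q_exact = unreliability(lam, t)
q_approx = lam * t  # first-order approximation, valid when lam * t is small
print(f"exact = {q_exact:.6e}, approx = {q_approx:.6e}")
```

The two values agree closely here because the product of failure rate and time is only 0.001; the approximation degrades as that product grows.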

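Consider an interval of time from 0 to T as shown in the
figure below.

[Figure: time axis from 0 to T showing the interval from $t - \tau_i$ to $t$ followed by the small interval from $t$ to $t + dt$]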

In order for a failure to exist in the small time interval dt,
the failure must occur either in the small interval dt, or in
the time interval from $t - \tau_i$ to t. If the failure occurs
before $t - \tau_i$, it will be repaired before it can exist in the
dt interval; and if it occurs after t + dt, it cannot possibly
exist in the dt interval. The probability of event $A_i$ happening
in the period $\tau_i$ is $(1 - e^{-\lambda_i \tau_i})$. The probability of event $A_i$
happening in the dt time interval is $\lambda_i\,dt$. These are the only
two ways in which the event $A_i$ can happen. The probability for
all events $A_1, A_2, \ldots, A_n$ to coexist in the dt interval is
given by

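$$ H\,dt = \lambda_1\,dt\,(1 - e^{-\lambda_2 \tau_2})(1 - e^{-\lambda_3 \tau_3}) \cdots (1 - e^{-\lambda_n \tau_n}) $$
$$ \quad + \lambda_2\,dt\,(1 - e^{-\lambda_1 \tau_1})(1 - e^{-\lambda_3 \tau_3}) \cdots (1 - e^{-\lambda_n \tau_n}) $$
$$ \quad + \cdots $$
$$ \quad + \lambda_n\,dt\,(1 - e^{-\lambda_1 \tau_1})(1 - e^{-\lambda_2 \tau_2}) \cdots (1 - e^{-\lambda_{n-1} \tau_{n-1}}) \qquad (3) $$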


Consider the first term in this formula, which is the probability
that event $A_1$ occurs during dt and coexists with the other
failures having occurred previous to t. The probability of
event $A_1$ occurring in dt is $\lambda_1\,dt$, and the probability of
occurrence of $A_2$ during the period $\tau_2$ previous to t is
$(1 - e^{-\lambda_2 \tau_2})$. The product of these probabilities for events
$A_1$ through $A_n$ gives the probability of the coexistence of all
events, where only $A_1$ occurs during dt. The second term gives the
probability of the coexistence of $A_1, A_2, \ldots, A_n$ where only
$A_2$ occurs during the interval dt. The sum of these n terms equals
the probability of n events coexisting during the dt interval.


Let f(t) be the probability that $A_1, A_2, \ldots, A_n$ have not coexisted
up to time t. Then f(t + dt) expresses the probability that
$A_1, A_2, \ldots, A_n$ have not coexisted from time 0 to t + dt. This
can be expressed as

$$ f(t + dt) = f(t)\,(1 - H\,dt) \qquad (4) $$

where f(t + dt) equals the product of the probability of no
coexistence of the items $A_1$ through $A_n$ from 0 to t, f(t), and
the probability of no coexistence of the items $A_1$ through $A_n$
from time t to t + dt, (1 - H dt).

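By definition, the differential of f(t) is f(t + dt) - f(t); therefore:

$$ df(t) = f(t)\,(1 - H\,dt) - f(t) $$
$$ df(t) = -f(t)\,H\,dt $$

and

$$ \frac{df(t)}{f(t)} = -H\,dt $$

Solving this differential equation by integration,

$$ \ln f(t) = -Ht + C \qquad (5) $$

At time zero, the probability that $A_1, A_2, \ldots, A_n$ have not
coexisted is equal to 1. Then f(t) = 1 when t = 0, and ln(1) = C.
Since ln(1) = 0, then, from (5), at the end of the period T,

$$ \ln f(T) = -HT \qquad (6) $$
$$ f(T) = e^{-HT} $$

The probability that events $A_1$ through $A_n$ have coexisted at
some time during the period T is

$$ P(A) = 1 - f(T) = 1 - e^{-HT} \qquad (7) $$

For sufficiently small HT,

$$ P(A) \approx HT. $$

Since each $\lambda_i \tau_i$ is small, $1 - e^{-\lambda_i \tau_i} \approx \lambda_i \tau_i$ in (3). Hence

$$ P(A) \approx HT = (\lambda_1\,\lambda_2\tau_2\,\lambda_3\tau_3 \cdots \lambda_n\tau_n + \lambda_2\,\lambda_1\tau_1\,\lambda_3\tau_3 \cdots \lambda_n\tau_n + \cdots + \lambda_n\,\lambda_1\tau_1\,\lambda_2\tau_2 \cdots \lambda_{n-1}\tau_{n-1})\,T $$
$$ = \lambda_1 \lambda_2 \cdots \lambda_n\,(\tau_2\tau_3 \cdots \tau_n + \tau_1\tau_3 \cdots \tau_n + \cdots + \tau_1\tau_2 \cdots \tau_{n-1})\,T $$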
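
The coexistence result can be checked numerically. The following is a minimal sketch, not part of the original handbook: the function names and the sample rates are hypothetical, and it simply evaluates H from equation (3) and P(A) from equation (7), either exactly or with the small-value approximations used above.

```python
import math


def coexistence_rate(rates, taus, exact=True):
    """Return H of equation (3) for independent repairable items.

    rates[i] is the failure rate lambda_i and taus[i] the repair time tau_i.
    With exact=False, 1 - exp(-lambda*tau) is replaced by lambda*tau.
    """
    n = len(rates)

    def prior_failure(j):
        x = rates[j] * taus[j]
        return -math.expm1(-x) if exact else x

    # One term per item: item i fails in dt, all others failed within their tau.
    return sum(
        rates[i] * math.prod(prior_failure(j) for j in range(n) if j != i)
        for i in range(n)
    )


def coexistence_probability(rates, taus, period, exact=True):
    """Return P(A) of equation (7), or its approximation H*T when exact=False."""
    h = coexistence_rate(rates, taus, exact)
    return -math.expm1(-h * period) if exact else h * period


# Hypothetical sample values (per hour, hours).
sample_rates = [1e-4, 2e-4]
sample_taus = [10.0, 20.0]
p_exact = coexistence_probability(sample_rates, sample_taus, 1000.0, exact=True)
p_approx = coexistence_probability(sample_rates, sample_taus, 1000.0, exact=False)
```

For these sample values the approximation HT is 6e-4, and the exact expression differs from it by well under one percent, which is the regime the derivation assumes.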
2.9.3.2 "AND" Gate $\lambda$

The form of the probability expression for the coexistence of
failures $A_1, A_2, \ldots, A_n$ suggests that the failure rate
for all these events is

$$ \lambda_{\cap} = \lambda_1 \lambda_2 \cdots \lambda_n\,(\tau_2\tau_3 \cdots \tau_n + \tau_1\tau_3 \cdots \tau_n + \cdots + \tau_1\tau_2 \cdots \tau_{n-1}) $$

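2.9.3.3 "AND" Gate $\tau$

Consider a situation in which events $A_1, A_2, \ldots, A_{n+1}$ must
coexist to produce an undesired event. No output will occur
for the duration of the time $\tau_{\cap}$ when only events $A_1, A_2, \ldots, A_n$
coexist. Let $\lambda_{\cap}$ be the failure rate and $\tau_{\cap}$ the effective
period of coexistence of failures $A_1$ through $A_n$. An expression
for the period $\tau_{\cap}$ is derived as follows:

$$ \lambda_{\cap}\lambda_{n+1}(\tau_{\cap} + \tau_{n+1}) = \lambda_1\lambda_2 \cdots \lambda_{n+1}\,(\tau_2\tau_3 \cdots \tau_{n+1} + \cdots + \tau_1\tau_2 \cdots \tau_n) $$

Since

$$ \lambda_{\cap} = \lambda_1\lambda_2 \cdots \lambda_n\,(\tau_2\tau_3 \cdots \tau_n + \tau_1\tau_3 \cdots \tau_n + \cdots + \tau_1\tau_2 \cdots \tau_{n-1}) $$

then

$$ \lambda_1\lambda_2 \cdots \lambda_n\,(\tau_2\tau_3 \cdots \tau_n + \cdots + \tau_1\tau_2 \cdots \tau_{n-1})\,\lambda_{n+1}(\tau_{\cap} + \tau_{n+1}) = \lambda_1\lambda_2 \cdots \lambda_{n+1}\,(\tau_2\tau_3 \cdots \tau_{n+1} + \tau_1\tau_3 \cdots \tau_{n+1} + \cdots + \tau_1\tau_2 \cdots \tau_n) $$

Therefore:

$$ \tau_{\cap} = \frac{\tau_1\tau_2 \cdots \tau_n}{\tau_2\tau_3 \cdots \tau_n + \tau_1\tau_3 \cdots \tau_n + \cdots + \tau_1\tau_2 \cdots \tau_{n-1}} = \frac{1}{\dfrac{1}{\tau_1} + \dfrac{1}{\tau_2} + \cdots + \dfrac{1}{\tau_n}} $$

by mathematical induction.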
2.9.3.4 "OR" Gate $\lambda$

Considering the same group of n items, i = 1, 2, ..., n,
the probability that none of the events occurs during
time period T is given by

$$ R(t) = e^{-\lambda_1 T}\,e^{-\lambda_2 T}\,e^{-\lambda_3 T} \cdots e^{-\lambda_n T} $$
$$ R(t) = e^{-(\lambda_1 + \lambda_2 + \cdots + \lambda_n)T} $$

Hence the probability that any one of the events occurs is

$$ Q(t) = 1 - R(t) = 1 - e^{-(\lambda_1 + \lambda_2 + \cdots + \lambda_n)T} $$

Therefore the failure rate for the occurrence of any event in
the time interval is $\lambda_{\cup} = \lambda_1 + \lambda_2 + \cdots + \lambda_n$, from the general
form of the reliability equation.

2.9.3.5 "OR" Gate $\tau$

To find the effective duration for the condition that any one
of the group of items may fail in the time period, consider the
following example. Let any one of the events $A_1, A_2, \ldots, A_n$
coexist with an event $A_{n+1}$. Let $\lambda_{\cup}$ and $\tau_{\cup}$ represent respectively
the failure rate and effectivity time obtained from the union of
events $A_1$ to $A_n$, when event $A_1$ or $A_2$ or $A_3, \ldots, A_n$ occurs in
the given time interval. If these events $A_1, A_2, \ldots, A_n$ occur
with event $A_{n+1}$, the result is $\lambda_{\cup}\lambda_{n+1}(\tau_{\cup} + \tau_{n+1})$ from the
coexistence of failures discussion, and

$$ \lambda_{\cup}\lambda_{n+1}(\tau_{\cup} + \tau_{n+1}) = \lambda_1\lambda_{n+1}(\tau_1 + \tau_{n+1}) + \lambda_2\lambda_{n+1}(\tau_2 + \tau_{n+1}) + \cdots + \lambda_n\lambda_{n+1}(\tau_n + \tau_{n+1}) $$

Since $\lambda_{\cup} = \lambda_1 + \lambda_2 + \cdots + \lambda_n$, then

$$ (\lambda_1 + \lambda_2 + \cdots + \lambda_n)\,\lambda_{n+1}(\tau_{\cup} + \tau_{n+1}) = \lambda_1\lambda_{n+1}(\tau_1 + \tau_{n+1}) + \lambda_2\lambda_{n+1}(\tau_2 + \tau_{n+1}) + \cdots + \lambda_n\lambda_{n+1}(\tau_n + \tau_{n+1}) $$

Therefore

$$ \tau_{\cup} = \frac{\lambda_1\tau_1 + \lambda_2\tau_2 + \cdots + \lambda_n\tau_n}{\lambda_1 + \lambda_2 + \cdots + \lambda_n} $$

The outputs of the AND and OR gates are given in tabular form
at the end of this paper.
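
The gate relations above lend themselves to a short calculation. The sketch below is illustrative only: the function names `and_gate` and `or_gate` and the sample values are assumptions, not taken from the handbook. Each function returns the equivalent failure rate and effective duration of a gate output so that gates can be chained up a fault tree.

```python
import math


def and_gate(rates, taus):
    """Equivalent (lambda, tau) for an AND gate under the Lambda-Tau approximation.

    lambda = lambda_1...lambda_n * sum over i of the product of all tau_j, j != i
    tau    = 1 / (1/tau_1 + ... + 1/tau_n)
    """
    n = len(rates)
    tau_products = sum(
        math.prod(taus[j] for j in range(n) if j != i) for i in range(n)
    )
    lam = math.prod(rates) * tau_products
    tau = 1.0 / sum(1.0 / t for t in taus)
    return lam, tau


def or_gate(rates, taus):
    """Equivalent (lambda, tau) for an OR gate.

    lambda = sum of lambda_i
    tau    = sum(lambda_i * tau_i) / sum(lambda_i)
    """
    lam = sum(rates)
    tau = sum(l * t for l, t in zip(rates, taus)) / lam
    return lam, tau


# Hypothetical inputs: failure rates per hour, repair times in hours.
lam_and, tau_and = and_gate([1e-3, 2e-3], [10.0, 5.0])
lam_or, tau_or = or_gate([1e-3, 2e-3], [10.0, 5.0])

# Chaining two-input AND gates reproduces the three-input AND gate.
lam_12, tau_12 = and_gate([1e-3, 2e-3], [10.0, 5.0])
lam_chain, tau_chain = and_gate([lam_12, 4e-3], [tau_12, 2.0])
lam_direct, tau_direct = and_gate([1e-3, 2e-3, 4e-3], [10.0, 5.0, 2.0])
```

The chained and direct three-input results agree, which is the consistency property that the induction argument for the AND gate duration relies on.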
2.9.3.6 Failures Occurring in a Given Order

The probability expression for n items failing in an interval
of time in a given order will be derived in the following
discussion, and an approximation for small $\lambda T$ will be shown.

Consider a group of n items, $A_1, A_2, \ldots, A_n$, each working
at the beginning of an arbitrary interval of time, T. Let
$\lambda_1, \lambda_2, \ldots, \lambda_n$ be the respective failure rates of the n items,
and suppose that $\lambda_1 T, \ldots, \lambda_n T$ are very small. Let E be the
event which occurs when $A_1, A_2, \ldots, A_n$ all fail in some
specified order, e.g., $A_1$ occurs, then $A_2$, then $A_3$, etc.; then

$$ P(E) \approx \frac{\lambda_1 \lambda_2 \cdots \lambda_n T^n}{n!} $$

In previous discussion, the expression $\lambda_1 \lambda_2 \cdots \lambda_n T^n / n!$
was obtained for the probability of occurrence of n events
$A_1, A_2, \ldots, A_n$ in a particular order over a time period T.
Using these results, the probability will now be obtained for
the occurrence of four events in order over a time period T
when repair times are unequal.

Let four events $A_1, A_2, A_3, A_4$ have respective repair times
$\tau_1, \tau_2, \tau_3, \tau_4$ and failure rates $\lambda_1, \lambda_2, \lambda_3, \lambda_4$. Let the
magnitudes of the repair times have the relationship
$\tau_1 > \tau_2 > \tau_3 > \tau_4$, as shown below.


[Figure: time line from 0 to T showing the nested repair intervals $\tau_1 > \tau_2 > \tau_3$ ending at t, followed by the increment from t to t + dt.]


For this particular example event $A_1$ shall occur first, then
$A_2$, then $A_3$, then $A_4$. Events $A_1$, $A_2$, and $A_3$ shall occur prior
to t and event $A_4$ shall occur in the dt interval. The
probability of $A_4$ occurring in the dt interval is $\lambda_4\,dt$. To coexist
with $A_4$ in the dt interval, $A_1$, $A_2$, and $A_3$ can occur in the
five following ways:

a. $A_1$ occurs in interval $\tau_1 - \tau_2$, $A_2$ occurs in interval
$\tau_2 - \tau_3$, and $A_3$ occurs in interval $\tau_3$.

$$ P(a) = \lambda_1(\tau_1 - \tau_2)\,\lambda_2(\tau_2 - \tau_3)\,\lambda_3\tau_3 $$

b. $A_1$ occurs in interval $\tau_1 - \tau_2$, and $A_2$ and $A_3$ both occur in
order in interval $\tau_3$.

$$ P(b) = \lambda_1(\tau_1 - \tau_2)\,\frac{\lambda_2\lambda_3\tau_3^2}{2} $$

c. $A_1$ and $A_2$ both occur in order in the interval $\tau_2 - \tau_3$, and $A_3$
occurs in the interval $\tau_3$.

$$ P(c) = \frac{\lambda_1\lambda_2(\tau_2 - \tau_3)^2}{2}\,\lambda_3\tau_3 $$

d. $A_1$ occurs in interval $(\tau_2 - \tau_3)$, and $A_2$ and $A_3$ occur in
order in the interval $\tau_3$.

$$ P(d) = \lambda_1(\tau_2 - \tau_3)\,\frac{\lambda_2\lambda_3\tau_3^2}{2} $$

e. $A_1$, $A_2$, and $A_3$ all occur in order in the interval $\tau_3$.

$$ P(e) = \frac{\lambda_1\lambda_2\lambda_3\tau_3^3}{6} $$

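The total probability, P(t), for the occurrence of $A_1$, $A_2$, $A_3$
in order is the sum of these probabilities:

$$ P(t) = P(a) + P(b) + P(c) + P(d) + P(e) $$
$$ = \lambda_1\lambda_2\lambda_3\left(\tau_1\tau_2\tau_3 - \frac{\tau_1\tau_3^2}{2} - \frac{\tau_2^2\tau_3}{2} + \frac{\tau_3^3}{6}\right) $$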

The product of P(t) and $\lambda_4\,dt$, therefore, gives the probability
that $A_1, A_2, A_3, A_4$ occur in the given order and coexist for the
first time in the dt interval. If f(t) is defined as the
probability that $A_1, A_2, A_3, A_4$ have not occurred up to time t in a
given order, and f(t + dt) is the probability that $A_1, A_2, A_3$, and
$A_4$ have not occurred up to time t + dt in a given order, then

$$ f(t + dt) = f(t)\,(1 - P(t)\lambda_4\,dt) $$

Since $P(t)\lambda_4\,dt$ gives the probability that $A_1, A_2, A_3, A_4$ occur
in the given order, $1 - P(t)\lambda_4\,dt$ gives the probability that they
do not occur as specified.

$$ f(t + dt) - f(t) = -f(t)P(t)\lambda_4\,dt $$
$$ df(t) = -f(t)P(t)\lambda_4\,dt $$
$$ \frac{df(t)}{f(t)} = -P(t)\lambda_4\,dt $$
$$ \ln f(T) = -P(t)\lambda_4 \int_0^T dt = -P(t)\lambda_4 T $$
$$ f(T) = e^{-P(t)\lambda_4 T} $$
$$ 1 - f(T) = 1 - e^{-P(t)\lambda_4 T} $$

If $P(t)\lambda_4 T$ is small, then the probability of the occurrence
of this chain of events over time T, P(1234), is

$$ P(1234) = \lambda_1\lambda_2\lambda_3\lambda_4\left(\tau_1\tau_2\tau_3 - \frac{\tau_1\tau_3^2}{2} - \frac{\tau_2^2\tau_3}{2} + \frac{\tau_3^3}{6}\right)T $$


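By similar manipulations, the probability for the occurrence
of $A_2, A_1, A_3, A_4$ in that order is

$$ P(2134) = \lambda_1\lambda_2\lambda_3\lambda_4\left(\frac{\tau_2^2\tau_3}{2} - \frac{\tau_2\tau_3^2}{2} + \frac{\tau_3^3}{6}\right)T $$

Similarly

$$ P(2314) = \lambda_1\lambda_2\lambda_3\lambda_4\left(\frac{\tau_2\tau_3^2}{2} - \frac{\tau_3^3}{2} + \frac{\tau_3^3}{6}\right)T $$

$$ P(3214) = \lambda_1\lambda_2\lambda_3\lambda_4\,\frac{\tau_3^3}{6}\,T $$

$$ P(3124) = \lambda_1\lambda_2\lambda_3\lambda_4\,\frac{\tau_3^3}{6}\,T $$

$$ P(1324) = \lambda_1\lambda_2\lambda_3\lambda_4\left(\frac{\tau_1\tau_3^2}{2} - \frac{\tau_3^3}{2} + \frac{\tau_3^3}{6}\right)T $$

The sum of these probabilities is

$$ P(4) = \lambda_1\lambda_2\lambda_3\lambda_4\,(\tau_1\tau_2\tau_3)\,T $$

If $A_3$ is the last event, $\tau_4$ takes the place of $\tau_3$ on the
figure and the resulting probability is

$$ P(3) = \lambda_1\lambda_2\lambda_3\lambda_4\,(\tau_1\tau_2\tau_4)\,T $$

Similarly, if $A_2$ and $A_1$ are respectively the last events, the
associated probabilities are

$$ P(2) = \lambda_1\lambda_2\lambda_3\lambda_4\,(\tau_1\tau_3\tau_4)\,T $$

$$ P(1) = \lambda_1\lambda_2\lambda_3\lambda_4\,(\tau_2\tau_3\tau_4)\,T $$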

These probabilities may be added (P(1), P(2), P(3), and P(4) being
mutually exclusive), giving the total probability of the
coexistence of $A_1$, $A_2$, $A_3$, and $A_4$:

$$ P = \lambda_1\lambda_2\lambda_3\lambda_4\,(\tau_1\tau_2\tau_3 + \tau_1\tau_2\tau_4 + \tau_1\tau_3\tau_4 + \tau_2\tau_3\tau_4)\,T $$

It is to be noted that this is equivalent to the coexistence
formula. Thus, the probability for the coexistence of events
can be obtained as the sum of the probabilities of each ordered
chain of events.
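
The statement that the ordered chains sum to the coexistence formula can be verified with exact arithmetic. The sketch below is an illustration only; the function name `ordered_volume` and the sample repair times are assumptions. It computes, for one ordering of the first three failures, the volume of admissible failure times before t (the bracketed factor multiplying the product of the failure rates), by splitting the time axis at each repair time and summing length^k / k! over every admissible placement.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def ordered_volume(order, taus):
    """Volume of failure times before t that occur in `order` and still exist at t.

    taus maps item -> repair time; item i must fail within taus[i] before t.
    """
    # Breakpoints measured backward from t, sorted from farthest to nearest.
    cuts = sorted({Fraction(0)} | {Fraction(v) for v in taus.values()}, reverse=True)
    # Segments (far, near) listed from earliest to latest in time.
    segments = [(cuts[k], cuts[k + 1]) for k in range(len(cuts) - 1)]
    total = Fraction(0)
    for choice in product(range(len(segments)), repeat=len(order)):
        # Earlier failures must lie in the same or an earlier segment.
        if any(choice[k] > choice[k + 1] for k in range(len(choice) - 1)):
            continue
        # Each item must fall inside its own repair window.
        if any(segments[s][0] > taus[item] for item, s in zip(order, choice)):
            continue
        term = Fraction(1)
        for s, (far, near) in enumerate(segments):
            k = choice.count(s)
            # k failures in a fixed order inside one segment: length^k / k!
            term *= (far - near) ** k / factorial(k)
        total += term
    return total


# Hypothetical repair times with tau_1 > tau_2 > tau_3.
sample_taus = {1: Fraction(5), 2: Fraction(3), 3: Fraction(2)}
volume_123 = ordered_volume((1, 2, 3), sample_taus)
volume_all = sum(ordered_volume(p, sample_taus) for p in permutations((1, 2, 3)))
```

With these sample values the six orderings sum to 5 x 3 x 2 = 30, the product of the three repair times, which is the factor appearing in P(4).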
APPENDIX D - FRACTURE MECHANICS ASSESSMENT


1.0 INTRODUCTION


1.1 APPLICATION TO SAFETY ANALYSES 

One of the more hazardous elements in many systems is the subsystem under
pressure. The fragmentation hazard of components under pressure is
especially difficult to analyze because little is understood about the
physical laws governing the failure process. Improved accuracy of the
predictions of the time or cycles to failure can reduce the risk of equipment
damage and personnel injury. The following sections describe a model of
fracture mechanics which has been validated by experimental results. Use
of this model in safety analyses will help to reduce risk levels associated
with pressurized systems.

1.2 DISCUSSION OF ANALYSIS METHOD 


1.2.1


A list of symbols used in the mathematical model is included herein. 
Detailed descriptions of methods and derivations may be found in the 
references listed in Section 5.0. 


LIST OF SYMBOLS

$K_I$            Plane strain stress intensity factor.

$K_{Ii}$         Plane strain stress intensity factor at initial conditions.

$K_{Ic}$         Plane strain critical stress intensity factor or fracture toughness
                 of the material.

$K_{TH}$         Plane strain threshold stress intensity level.

$a$              Semi-minor axis of the ellipse $x^2/c^2 + y^2/a^2 = 1$, or crack depth.

$2c$             Crack length of the semi-elliptical surface flaw.

$t$              Thickness of plate (specimen).

$\Phi$           Complete elliptical integral of the second kind having modulus k
                 defined as $k = (1 - a^2/c^2)^{1/2}$.

$\sigma$         Uniform stress applied at infinity and perpendicular to the plane
                 of crack.

$\sigma_{op}$    Maximum design operating stress.

                 Ultimate strength of the material.

$\sigma_{ys}$    Uniaxial tensile yield strength of the material.

$Q$              Flaw shape parameter $= \Phi^2 - 0.212\,(\sigma/\sigma_{ys})^2$.

$M_K$            Stress intensity magnification factor for deep surface flaws based
                 on Kobayashi's solution.

$\alpha$         Proof test factor (ratio of proof stress to maximum design operating stress).

$N$              Number of cycles.

                 Time.

$R$              Ratio of minimum to maximum stress during a cycle.

Subscripts 

cr at critical conditions 

i at initial condition 

op operational 

1.2.2 General 


The minimum operational cyclic life of a pressure vessel at the maximum
design operating stress can be determined if the proof test factor $\alpha$, maximum
design operating stress $\sigma_{op}$, fracture toughness $K_{Ic}$, and the experimental
cyclic and sustained stress flaw growth data for the vessel materials are
available. The proof test factor with $\sigma_{op}$ and $K_{Ic}$ establishes the initial and
critical flaw sizes. For cycles with short hold times at the maximum
pressure, the cyclic flaw growth data alone are sufficient to predict the
number of cycles required to grow from the initial to the critical flaw size.
If the vessel is to be pressure cycled with prolonged hold times at the
maximum pressure, the cyclic as well as sustained stress flaw growth data are
needed. The minimum remaining cyclic life of the vessel, in this case, is
the number of cycles required to reach the threshold stress intensity $K_{TH}$.
Knowing the applied and anticipated pressure cycle history of the vessel,
the minimum remaining cyclic life of the pressure vessel at $\sigma_{op}$ can be
predicted and the assessment of the vessel can be made with regard to the
fracture mode.

This is discussed in detail in the following sections.

Section 2.0 deals with the prediction of the cyclic life of a thick-walled
vessel while the thin-walled vessel is treated in Section 3.0. Section 4.0
gives the experimental justification for the technical approach taken in
Sections 2.0 and 3.0.
2.0 PREDICTION OF CYCLIC LIFE FOR A THICK-WALLED VESSEL

Prediction of the cyclic life of a thick-walled pressure vessel can be made
utilizing the proof test factor and the relations between $K_{Ii}/K_{Ic}$ and cycles
to failure for various values of R (ratio of the minimum to maximum stress
during a cycle) for the material-environment combination. This can best be
illustrated by an example.

Suppose a liquid nitrogen 5Al-2.5Sn (ELI) titanium pressure vessel is
successfully proof tested with LN2 to a factor of 1.25 x the maximum design
operating pressure. For illustration purposes, it is assumed that the
proof tested tank is subjected to the following pressure cycles before and
during the flight. It is also assumed that all the cycles are applied with
R equal to zero.

1. 240 loading cycles with the maximum stress as 90 percent of $\sigma_{op}$.

2. 70 loading cycles with the maximum stress as 95 percent of $\sigma_{op}$.

3. A long duration flight cycle at $\sigma_{op}$.

It is desired to assess the structural integrity of the pressure vessel 
from the fracture mechanics viewpoint. 

The combined sustained and cyclic stress life curve for 5Al-2.5Sn (ELI) Ti at
-320°F is reproduced from Reference 8 in Figure D1. Since the vessel is
proof tested with $\alpha$ = 1.25, the maximum possible $K_{Ii}/K_{Ic}$ ratio that could
exist in the vessel after the proof test at $\sigma_{op}$ would be 1/1.25 = 0.80. This is
shown by Point A in Figure D1. Hence, at 90 percent of $\sigma_{op}$, $K_{Ii}/K_{Ic}$ is
0.72. The 240 loading cycles at 0.90 $\sigma_{op}$ as the maximum stress change the
$K_{Ii}/K_{Ic}$ ratio from Point A to Point B. Point B is 240 cycles to the left
of Point A, with the cycles measured along the abscissa of the plot. Hence,
the $K_{Ii}/K_{Ic}$ ratio at the end of 240 cycles at 0.90 $\sigma_{op}$ is 0.778.

The stress is increased by 5 percent after the end of 240 cycles at 0.90
$\sigma_{op}$. The flaw size remains the same during the stress increase. Therefore,
the $K_{Ii}/K_{Ic}$ ratio at the beginning of 70 cycles at 0.95 $\sigma_{op}$ is (0.95/0.90) x
0.778 = 0.821. This is shown by Point B in Figure D1.

The 70 cycles at 0.95 $\sigma_{op}$ change the $K_{Ii}/K_{Ic}$ ratio from Point B to Point C,
where Point C is 70 cycles to the left of Point B in Figure D1. The $K_{Ii}/K_{Ic}$
ratio at the end of 70 cycles at 0.95 $\sigma_{op}$ is 0.85. Hence, the $K_{Ii}/K_{Ic}$ ratio
based on $\sigma_{op}$ is (1.0/0.95) x 0.85 = 0.895.

The threshold stress intensity value for sustained stress flaw growth for
the material under LN2 environment is 90 percent of $K_{Ic}$ (8). Since at the
beginning of the long duration flight cycle the $K_{Ii}/K_{Ic}$ ratio is less than
$K_{TH}/K_{Ic}$, the vessel is considered to be safe for the flight. Also, it can
be seen from Figure D1 that 10 cycles at $\sigma_{op}$ will raise $K_{Ii}/K_{Ic}$ to the
level of $K_{TH}/K_{Ic}$. Hence, the estimated minimum remaining cyclic life for the
vessel is 9 (10 - 1 long flight cycle) cycles.
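
The arithmetic in this example can be traced with a short script. This is a sketch only: the helper names are hypothetical, and the ratios reached at the end of each block of cycles (0.778 and 0.85) are values read from Figure D1 in the text, not computed here. The script covers just the two bookkeeping steps, the post-proof ratio 1/alpha and the rescaling of the ratio when the stress level changes at constant flaw size.

```python
def ratio_after_proof(proof_factor):
    """Maximum possible K_Ii/K_Ic at operating stress after a successful proof test."""
    return 1.0 / proof_factor


def rescale_ratio(ratio, old_stress_fraction, new_stress_fraction):
    """K_Ii/K_Ic after a stress change at constant flaw size (K is proportional to stress)."""
    return ratio * new_stress_fraction / old_stress_fraction


threshold = 0.90                               # K_TH / K_Ic quoted in the text
start = ratio_after_proof(1.25)                # at sigma_op
at_90 = rescale_ratio(start, 1.00, 0.90)       # start of the 240 cycles
end_240 = 0.778                                # read from Figure D1 (per the text)
at_95 = rescale_ratio(end_240, 0.90, 0.95)     # start of the 70 cycles
end_70 = 0.85                                  # read from Figure D1 (per the text)
at_flight = rescale_ratio(end_70, 0.95, 1.00)  # start of the flight cycle
safe_for_flight = at_flight < threshold
```

The same two helpers apply to any load history; only the end-of-block ratios must come from the life curve for the relevant material, environment, and R value.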



This is the procedure followed in assessing the structural integrity of the
thick-walled vessels. In the first analysis for the assessment of the
structural integrity of the thick-walled vessel, it is always assumed that
all the pressure cycles are applied at R = 0. Since the analysis based on
R = 0 will always show a remaining cyclic life less than that based on
the analysis of R ≠ 0 (actual R ratios), the prediction of cyclic life
based on the analysis of R = 0 is invariably conservative. If the pressure
vessel is shown unsatisfactory for the flight based on R = 0, then the
prediction analysis for the remaining cyclic life is conducted based on the
actual R values at which the cycles are applied. For clarity, an
illustrative example is given below.

Suppose a thick-walled 6Al-4V (STA) titanium helium tank is successfully
proof tested at a proof test factor of 1.50 x the maximum design operating
stress. Suppose the proof tested tank is subjected to the following
pressure cycles before the flight, which are also shown in Figure D2.

1. 200 loading cycles with the maximum stress as 90 percent of $\sigma_{op}$ and
R = 0.1. Environment is Room Temperature (R.T.).

2. 4300 loading cycles with the maximum stress as $\sigma_{op}$ and R = 0.7, R.T.

3. 260 loading cycles with the maximum stress as 95 percent of $\sigma_{op}$ and
R = 0.4, R.T.

4. 40 loading cycles with the maximum stress as $\sigma_{op}$ and R = 0.1, R.T.

The cyclic life curves for 6Al-4V (STA) titanium for the environment of R.T.
air are reproduced for R = 0.0, R = 0.1, R = 0.4, and R = 0.7 from
Reference (10) in Figure D3. The difference between the plots of cyclic life
against $K_{Ii}/K_{Ic}$ for R = 0 and R = 0.1 is negligible for this
material-environment combination, and hence both are shown by the same plot in
Figure D3. The threshold stress intensity level for the material in the
environment of R.T. air is 90 percent of $K_{Ic}$ (10).

The maximum possible $K_{Ii}/K_{Ic}$ ratio that could exist in the vessel after the
proof test at $\sigma_{op}$ is $1/\alpha$ = 1/1.50 = 0.667. From Figure D3, it can be seen from the
R = 0 plot that the maximum cycles to failure is about 600 at $\sigma_{op}$ if the
hold times at maximum stress are small. If the analysis is based on R = 0
instead of actual R, the pressure-cycle history shows that the vessel is
critical. In the following, the assessment of the vessel is made based on
the appropriate values of R.

At the beginning of 200 loading cycles with the maximum stress as 0.90 $\sigma_{op}$,
the maximum $K_{Ii}/K_{Ic}$ is given by 0.90 x 0.667 = 0.60. This point is indicated
by E on the R = 0.1 curve. The 200 loading cycles at 0.90 $\sigma_{op}$ and R = 0.1
change the $K_{Ii}/K_{Ic}$ ratio from Point E to Point D on the plot of R = 0.1.
The $K_{Ii}/K_{Ic}$ ratio at the end of 200 loading cycles at R = 0.1 is 0.63.

The stress is increased by 10 percent at the end of 200 cycles. Hence,
the $K_{Ii}/K_{Ic}$ ratio at the beginning of 4300 cycles at $\sigma_{op}$ and R = 0.7 is
(1.0/0.9) x 0.63 = 0.70. This is shown by Point D on the plot of R = 0.7.
The 4300 loading cycles at $\sigma_{op}$ and R = 0.7 change the $K_{Ii}/K_{Ic}$ ratio from
Point D to Point C on the plot of R = 0.7, where its value is 0.78.

The stress is decreased by 5 percent at the end of 4300 cycles. Hence, the
$K_{Ii}/K_{Ic}$ ratio at the beginning of 260 cycles at 0.95 $\sigma_{op}$ is (0.95/1.0) x
0.78 = 0.74, which is shown by Point C on the R = 0.4 plot. The 260 cycles
at 0.95 $\sigma_{op}$ and R = 0.4 change the $K_{Ii}/K_{Ic}$ ratio from Point C to Point B on
R = 0.4, where its value is 0.80.

The stress is increased by 5 percent at the end of 260 cycles. Hence, the
$K_{Ii}/K_{Ic}$ ratio at the beginning of 40 cycles at $\sigma_{op}$ is (1.0/0.95) x 0.80 =
0.84, which is illustrated by Point B on the R = 0.1 plot. The 40 cycles at $\sigma_{op}$
and R = 0.1 increase the $K_{Ii}/K_{Ic}$ ratio from 0.84 to 0.875, which is shown
by Point A in Figure D3.

Since the stress intensity at the end of 40 cycles at $\sigma_{op}$ is less than the
threshold stress intensity, the vessel is considered to be safe for the
flight. It will take 20 loading cycles at $\sigma_{op}$ and R = 0.1 to increase
$K_{Ii}/K_{Ic}$ from 0.875 to 0.90. Thus, the estimated minimum cyclic life
remaining for the vessel is 20 cycles.
3.0 PREDICTION OF CYCLIC LIFE FOR A THIN-WALLED VESSEL

3.1 BACKGROUND 

Analysis for the prediction of the cyclic life for a thin-walled vessel is
somewhat different from that for the thick-walled vessel. The flaw depth
becomes deep with respect to the wall thickness prior to reaching the
critical size for the thin-walled vessels. The stress intensity factor
calculated by the Kobayashi equation for the deep flaw is higher than the
one predicted by the original Irwin equation for the shallow surface flaw.

As a result, the subcritical flaw-growth rates for the thin-walled vessels,
having the same flaw size and subjected to the same stress as the
thick-walled vessels, are higher than those for the thick-walled vessels. Thus,
the total cyclic life for a thin-walled vessel is shorter than that
determined from curves of the type shown in Figures D4 and D5, which are developed
from the data of specimens where $a_{cr}/t$ is less than 0.5. If data similar
to that in Figures D4 and D5 ($K_{Ii}/K_{Ic}$ against cycles to failure and
$K_{Ii}/K_{Ic}$ versus time to failure) can be developed from specimens having
deep flaws and a thickness comparable to that of the vessel, then the
analysis described in Section 2.0 can be used to predict the cyclic life
of the thin-walled vessel remaining after the proof test. This data
development is complicated and expensive since the stress intensity
magnification factor for deep surface flaws, $M_K$, is a function of a/t as well
as a/2c. (Variation of $\sigma/\sigma_{ys}$ has a smaller effect on $M_K$ than the variations
of a/t.) Consequently, a large number of specimens would be required to
sort out the effect of a/t and a/2c. In the absence of these data, the
following analysis is used to calculate the cyclic life. The main assumptions
involved in the analysis are:

1. In the thin-walled vessels, the flaws are long with respect to their depth
and consequently, Q is assumed to be equal to unity in the Kobayashi
equation. This, in turn, raises the stress intensity and hence the flaw growth
rates, and gives the lower bound of the cyclic life.

2. The flaw growth rates are dependent on $K_{Ii}/K_{Ic}$ and hence, flaw growth
rates obtained from the specimens where $a_{cr}/t$ is less than 0.5 can be
used for the specimens where $a_{cr}/t$ approaches unity.

3. It is assumed that below the threshold level, flaw growth rates are not
affected by the presence of the propellant. Consequently, the flaw
growth rates for the material-propellant-temperature combination are
simulated by the material-temperature combination.

To determine the cyclic life of a thin-walled tank, the following relations
are required:

1. The proof test factor $\alpha$, $K_{Ic}$, and $K_{TH}$.

2. $\sigma$ versus "a" curve, similar to Figure D6, for $K_{Ic}$ and $K_{TH}$ to determine
the flaw sizes $a_i$, $a_{cr}$, and $a_{TH}$. The $\sigma$ versus "a" curve can be
obtained from the following equation:

$$ \sigma = \frac{K_I}{1.1\,M_K\sqrt{\pi a}} $$

3. $K_{Ii}/K_{Ic}$ versus flaw growth rate da/dN curve to determine the flaw growth
rate at any stress intensity level.

The flaw growth rates can be obtained by differentiating the $K_{Ii}/K_{Ic}$ versus
cycles to failure curve, similar to that of Figure D1. This curve is
obtained from the specimens where $a_{cr}/t$ is less than half. For an assumed
maximum cyclic stress level, say $\sigma_1$, the given $K_{Ii}/K_{Ic}$ versus N curve can
be converted to an a/Q versus N curve by the equation:

$$ a/Q = \frac{1}{1.21\,\pi}\left(\frac{K_I}{\sigma_1}\right)^2 $$

The slope of the a/Q versus N curve gives the plot of the flaw growth rate
d/dN (a/Q) versus $K_{Ii}/K_{Ic}$ for the stress level $\sigma_1$.

From the above equation, for a given $K_{Ii}$, a/Q at the stress level $\sigma_2$ is
related to a/Q at $\sigma_1$ as:

$$ (a/Q)_{\sigma_2} = \left(\frac{\sigma_1}{\sigma_2}\right)^2 (a/Q)_{\sigma_1} $$

From this equation, it can be concluded that the flaw growth rate at any
stress level $\sigma_2$ is related to the growth rate at $\sigma_1$ as follows:

$$ \left(\frac{d}{dN}(a/Q)\right)_{\sigma_2} = \left(\frac{\sigma_1}{\sigma_2}\right)^2 \left(\frac{d}{dN}(a/Q)\right)_{\sigma_1} $$

This stress level effect is supported by the experimental data in References
(7), (8), (10), and (11). If the basic $K_{Ii}/K_{Ic}$ versus cycle data is
obtained from experimental tests where the specimens are cycled at a
maximum stress at or near the expected operating stress levels in the
vessel, the effect of stress level need not be considered. The flaw growth
rate obtained in this manner from Figure 7 for 5Al-2.5Sn (ELI) titanium for
the maximum cyclic stress level of 139 ksi is given in Figure D7. Also, as
pointed out by Tiffany, et al (7), flaw growth rates can be approximated by
measuring striation spacings on electron fractographs obtained from the
fracture face of a surface flawed specimen cycled to failure in tension.
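
The relations of this section can be sketched as three small functions. This is an illustration only: the function names are hypothetical, the magnification factor is passed in as a plain number because its values come from a figure not reproduced here, and Q is taken as unity as in the stated assumptions.

```python
import math


def stress_for_flaw(k, a, m_k=1.0):
    """Stress giving stress intensity k at flaw depth a: sigma = K / (1.1 M_K sqrt(pi a))."""
    return k / (1.1 * m_k * math.sqrt(math.pi * a))


def a_over_q(k, sigma):
    """Flaw size parameter a/Q for stress intensity k at stress sigma (M_K = 1)."""
    return (k / sigma) ** 2 / (1.21 * math.pi)


def scale_growth_rate(rate_at_sigma1, sigma1, sigma2):
    """Growth rate d/dN(a/Q) at sigma2, at the same K ratio, from the rate at sigma1."""
    return (sigma1 / sigma2) ** 2 * rate_at_sigma1


# Hypothetical numbers: K in ksi*sqrt(in), stress in ksi, depth in inches.
depth = a_over_q(37.0, 100.0)
stress_back = stress_for_flaw(37.0, depth)
rate_at_half_stress = scale_growth_rate(1.0e-4, 100.0, 50.0)
```

Inverting `a_over_q` with `stress_for_flaw` recovers the original stress, which confirms that the two expressions are the same relation solved for different variables.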

3.2 APPROACH 

Knowing the proof stress and $K_{Ic}$, the maximum possible flaw size that can
exist after the proof test (assuming rapid depressurization) can
be determined from the plot of $\sigma$ against "a" for $K_{Ic}$. This flaw size is
denoted by $a_i$ in the illustrative example of Figure D8. Also, knowing $\sigma_{op}$
and $K_{Ic}$, the maximum possible flaw size that can exist at $\sigma_{op}$ can be
determined from the same plot for $K_{Ic}$. This flaw size is shown by $a_{cr}$ in
Figure D8. Similarly, the maximum flaw size that could exist at $\sigma_{op}$ and the
threshold stress intensity is shown by $a_{TH}$.

If the cycles to be applied to the vessel have short hold times at the
maximum stress $\sigma_{op}$, then the stress intensity at $\sigma_{op}$ can be allowed to
reach the critical value $K_{Ic}$. In this case, the flaw growth rates for the
vessel are arithmetically integrated using the stress intensity
magnification values from Figure D9a to calculate the number of cycles required to
grow from $a_i$ to $a_{cr}$. The relatively simple procedure for this integration
is illustrated in Figure D10. If $a_{cr}$ is less than the wall thickness, then
the total estimated cycles to failure will be obtained, and if it exceeds
the wall thickness then the total estimated cycles to leak will be obtained,
as explained in Section 2.4.2 of (5). The effect of deep flaw stress intensity
magnification on predicted critical flaw sizes for a typical tank material is
shown in Figure D9b, for both thick and thin-walled vessels.

If the cycles to be applied to the vessel have long hold times at the
maximum stress, the stress intensity cannot be allowed to exceed the sustained
stress threshold value $K_{TH}$. In this case, the flaw growth rates are
arithmetically integrated using $M_K$ to calculate the number of cycles required to
grow from $a_i$ to $a_{TH}$. This is the procedure followed in the prediction of
the cyclic life in Volumes II and III of (5).
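
The arithmetic integration described above can be outlined in code. The following is a sketch under stated assumptions, not the procedure of Figure D10 itself, which is not reproduced in this text: the function name, the midpoint stepping scheme, and the constant placeholder curves in the usage lines are all hypothetical. In practice `growth_rate` would be read from a curve such as Figure D11 and `m_k` from Figure D9a.

```python
import math


def cycles_to_grow(a_start, a_end, sigma, k_ic, thickness,
                   growth_rate, m_k, sigma_ref, steps=200):
    """Estimate the cycles needed for a flaw to grow from a_start to a_end.

    growth_rate(ratio) -> da/dN at K_I/K_Ic = ratio, measured at stress sigma_ref.
    m_k(a_over_t)      -> deep-flaw magnification factor for depth ratio a/t.
    Q is taken as unity, as assumed for thin-walled vessels.
    """
    da = (a_end - a_start) / steps
    cycles = 0.0
    for i in range(steps):
        a_mid = a_start + (i + 0.5) * da  # midpoint of this depth increment
        k = 1.1 * m_k(a_mid / thickness) * sigma * math.sqrt(math.pi * a_mid)
        # Stress-level correction: rate at sigma = (sigma_ref / sigma)^2 * reference rate.
        rate = growth_rate(k / k_ic) * (sigma_ref / sigma) ** 2
        cycles += da / rate
    return cycles


# Placeholder curves (assumptions): constant growth rate and no magnification.
flat_rate = lambda ratio: 1.0e-4   # inches per cycle
no_magnification = lambda a_over_t: 1.0

n_ref = cycles_to_grow(0.0143, 0.0196, 100.0, 37.0, 0.022,
                       flat_rate, no_magnification, sigma_ref=100.0)
n_double = cycles_to_grow(0.0143, 0.0196, 200.0, 37.0, 0.022,
                          flat_rate, no_magnification, sigma_ref=100.0)
```

With the constant placeholder rate the result reduces to the depth change divided by the rate, which gives a simple check on the stepping; with real curves the rate varies along the path and the step count controls the accuracy.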

The prediction of the remaining cyclic life and the structural integrity of 
the thin-walled vessel can best be demonstrated by an illustrative example. 

3.2.1 Thin-Walled Vessel - Illustrative Example 

Suppose a thin-walled 6Al-4V titanium (STA) propellant tank containing N2O4
at R.T. is successfully proof tested with water at R.T. to a proof test
factor of 1.41 x the maximum design operating stress, $\sigma_{op}$. Suppose the
proof tested tank is subjected to the following pressure cycles before the
flight.

1. 20 loading cycles with the maximum stress as 90 percent of $\sigma_{op}$.

2. 12 loading cycles with the maximum stress as 95 percent of $\sigma_{op}$.

3. 5 loading cycles with the maximum stress as $\sigma_{op}$.

It is desired to assess the structural integrity of the pressure vessel from
the fracture mechanics standpoint and estimate the minimum cyclic life
remaining for the vessel at $\sigma_{op}$. This example is treated with specific numbers
since the stress intensity factor has to be corrected for the a/t ratio according
to Figure D9a. The thickness of the tank is 0.022". The maximum design
operating stress, $\sigma_{op}$, is 87.5 ksi. The material of this gage under the
above-mentioned environmental conditions has the minimum fracture toughness of
37 ksi $\sqrt{in}$ and the threshold stress intensity of 80 percent of $K_{Ic}$.

The $\sigma$ versus "a" plots are given for $K_{Ic}$ and $K_{TH}$ = 0.80 $K_{Ic}$ in Figure D8.
Since the proof stress is 1.41 x $\sigma_{op}$ = 123.6 ksi, it is clear from Figure D8 that
the maximum possible $a_i$ that could exist is 0.0143". Here it is assumed that
the depressurization from the proof pressure is rapid enough so that no
significant flaw growth occurs during the depressurization. Also, as shown in Figure
D8, for the stress level of $\sigma_{op}$, $a_{cr}$ is 0.0196" and $a_{TH}$ is 0.0160".



The plot of K]_i/K lc versus flaw growth rate for 6A1-4V titanium at R.T. is 
reproduced in Figure Dll for <5 = 100 ksi from Reference 10. The 99$ 
confidence level flaw growth rate curve is used in the calculation of 
cyclic life. Since the above flaw growth rate curve is obtained from the 
cyclic data of R = 0.0, it is assumed in this example that all the 

cycles are applied at R = 0.0. 

Taking the effect of stress level on the flaw growth rates into account, 
flaw growth rates are arithmetically integrated from a^ = 0.0143" to 
a cr = 0.0196" according to Figure DIO to calculate the cycles to failure 
for tne stress level of O’ . The plot of flaw depth against cycles to 
failure for the stress levil of O^p is shown in ‘Figure M2. 

Whan the maximum cyclic stress is 0.95 O'™, ai is still 0.0143" but a cr is 
0.0208" and a^H = 0.0167" from Figure D 3* Based on the stress level of 
0.95 0 p, the flaw growth rates are integrated from a^ = 0.0143" to 

a cr = 0.208" to calculate the cycles to failure. Similar procedure is 
followed to obtain the relation of flaw depth against cycles to failure for 
the stress level of 0.90 ^ 0 p. These plots are shown in Figure DI2. 

At the end of the proof cycle and the beginning of the first cycle at the 
maximum cyclic stress of 0.90 O™, the maximum possible flaw depth is 
0.0143". This is shown by Point D in Figure D12. The 2C loading cycles with 
the maximum stress as 0.90 O'™ change "A" from Point D to Point C on the 
plot of 0.90 O' op as shown in Figure D12. • 

The tank wall stress is increased by 5 percent at the end of 20 loading 
cycles with the maximum stress as 0.90 CT 0 p. The flaw size remains the same 
during the stress increase. This is shown qy Point C on the plot of 0.95 <7 nn 
in Figure DU. P 

The 12 loading cycles with the maximum stress as 0.95 <5" op change "A" from 
Point C to Point B on the plot of 0.95 O 0 p in Figure DiZ.. 

At the end of 12 loading cycles with the maximum stress as 0.95 O 0 p, the 
stress is increased by 5 percent. This is shown by Point B on the plot of 
O op in Figure V i«. 

The 5 loading cycles with a maximum stress of σ_op change "a" from Point B
to Point A on the plot of σ_op in Figure D12. The flaw depth at A is
0.01534". This is smaller than a_TH, which is 0.0160". Hence the vessel is
considered to be safe for the flight. Also from Figure D12, it will take 7
cycles at σ_op to increase the flaw depth from 0.01534" to 0.0160". Hence,
the minimum estimated cyclic life remaining for the vessel is 7 cycles.
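
The graphical walk from Point D to Point A (enter a stress-level curve at the
current flaw depth, move along it by the number of applied cycles, then
transfer to the next curve at the same depth) can be sketched numerically.
The following is a minimal Python sketch, not the handbook's procedure as
published. It assumes each curve is available as tabulated (cycles to failure,
flaw depth) points. The curve values are hypothetical placeholders and are
not readings from Figure D12, so the resulting depths differ from those
quoted above. The function names are illustrative.

```python
from bisect import bisect_left


def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def grow_flaw(depth, cycles, curve):
    """Move along one stress-level curve by the given number of cycles.

    curve is a list of (cycles_to_failure, flaw_depth_in) points ordered by
    increasing cycles to failure, so flaw depth decreases along the list.
    """
    n_fail = [p[0] for p in curve]
    depths = [p[1] for p in curve]
    # Enter the curve at the current depth (depth axis reversed to ascend).
    remaining = interp(depth, depths[::-1], n_fail[::-1])
    # Applying cycles reduces the cycles left before failure.
    return interp(max(remaining - cycles, 0.0), n_fail, depths)


def run_spectrum(depth, blocks, curves):
    """Return the flaw depth after each (stress_ratio, cycles) block."""
    history = [depth]
    for stress_ratio, cycles in blocks:
        depth = grow_flaw(depth, cycles, curves[stress_ratio])
        history.append(depth)
    return history


# HYPOTHETICAL curves keyed by fraction of operating stress. Only the
# critical depths 0.0208" and 0.0196" appear in the text; every other
# point is invented for illustration.
CURVES = {
    0.90: [(0, 0.0220), (100, 0.0180), (300, 0.0150), (600, 0.0130)],
    0.95: [(0, 0.0208), (60, 0.0175), (200, 0.0150), (400, 0.0130)],
    1.00: [(0, 0.0196), (20, 0.0170), (80, 0.0150), (200, 0.0130)],
}
BLOCKS = [(0.90, 20), (0.95, 12), (1.00, 5)]  # load blocks from the text
A_TH = 0.0160  # threshold flaw depth at the operating stress, from the text

history = run_spectrum(0.0143, BLOCKS, CURVES)
print(["%.5f" % a for a in history])
print("below threshold:", history[-1] < A_TH)
```

The same two interpolations (depth to remaining cycles, remaining cycles back
to depth) are what the analyst performs by eye on the plotted curves.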

4.0 EXPERIMENTAL JUSTIFICATION FOR TECHNICAL APPROACH

The technical approach taken in Sections 2.0 and 3.0 requires
justification in the following areas:

1. Representation of cyclic life with K_Ii/K_Ic.

It has been shown (6, 7, 9, 10, 11) that the cyclic life of surface flawed
specimens correlates well with the maximum initial stress intensity at
the tip of the surface flaw. Also, in Reference 10, a large number of surface
flawed specimens of the same thickness were cycled to failure at four
different stress levels ranging from 96 ksi to 126 ksi. The results,
K_Ii/K_Ic against cycles to failure, are shown in Figure D13. This shows that
for a given K_Ii/K_Ic, the stress level has little influence on the
cyclic life.

2. Use of uniaxial specimen data in the prediction of the cyclic life of
biaxially loaded pressure vessels.

The cyclic life data obtained from the preflawed 5Al-2.5Sn (ELI) titanium tank
tests agree very well with the corresponding cyclic life data obtained from
preflawed uniaxial test specimens at R.T., -320°F, and -423°F (7).

The same reference also shows that cyclic life data obtained from 2219-T87
aluminum tank tests at R.T. and -320°F correlate very well with
those obtained from uniaxial specimens. The stress intensity versus cycles
to failure correlations for 2219-T87 aluminum specimens and tanks at R.T. and
-320°F are reproduced from Reference 7 in Figures D14 and D15. A similar
correlation is shown for Ladish D6A-C steel at R.T. in Reference (6). These
results indicate that the uniaxial plane strain cyclic life data and flaw growth
rates can be applied directly to the prediction of the cyclic lives and flaw
growth rates of biaxially loaded pressure vessels where the flaws grow
under plane strain conditions.

Figure D3: PREDICTION OF CYCLIC LIFE OF A THICK WALLED VESSEL
(Illustrative Example)

Figure D5: COMBINED SUSTAINED & CYCLIC STRESS LIFE DATA
(5Al-2 1/2 Sn (ELI) Titanium @ -320°F)

Figure D9b: CRITICAL FLAW SIZE CURVES @ LOX TEMPERATURE, 2219-T87 ALUMINUM

Figure D12: PREDICTION OF CYCLIC LIFE OF A THIN WALLED VESSEL
(Illustrative Example)













Figure D14: STRESS INTENSITY VS. CYCLES TO FAILURE CORRELATION FOR
2219-T87 ALUMINUM AT ROOM TEMPERATURE

Figure D15: STRESS INTENSITY VS. CYCLES TO FAILURE CORRELATION FOR
2219-T87 ALUMINUM AT -320°F

Appendix E

FAILURE MODE, EFFECTS, AND CRITICALITY ANALYSIS


1.0 INTRODUCTION 

1.1 APPLICATION TO SAFETY ANALYSIS 

Failure Mode, Effects, and Criticality Analyses (FMECA) have been 
used for years as a method of determining the reliability of a 
system. The same method may be used to determine the degree of 
safety to be expected from a system. The adaptation of the FMECA 
to system safety analysis requires that a different perspective be 
adopted by the analyst. The goal of a reliability analysis is the
prevention of "loss of mission", "loss of system", and "system
function degradation". The goal of a system safety analysis is
the prevention of "death or injury of personnel", "damage of the
system", and "system safety degradation". These system safety goals
are achieved by considering every component failure mode, including
improper commands to the component, which may have potentially
damaging effects. A list of components which are critical to safe
system use may be derived from the analysis, and the criticality 
(or probability of causing personnel injury or system damage) 
calculated for the appropriate failure modes. 

1.2 REFERENCES 

The material in this appendix has been chiefly extracted from
Procedures for Failure Mode, Effects, and Criticality Analysis
(FMECA), document number RA-006-013-1A, Office of Manned Space
Flight, National Aeronautics and Space Administration, August 1966.
Information on application of the FMECA method is also found in
Procedure for Performing Systems Design Analysis, Drawing No. 10M30111,
Revision A, George C. Marshall Space Flight Center, NASA, June 1964;
and in Reliability Stress and Failure Rate Data for Electronic
Equipment, MIL-HDBK-217A, Bureau of Naval Weapons, Department of
Defense, December 1965.

1.3 SUMMARY DESCRIPTION OF FMECA 

1.3.1 Definition of FMECA

For system safety analyses, FMECA is a procedure which documents
all possible failures in a system design within specified ground
rules, determines by failure mode analysis the effect of each failure
on system operation, identifies single failure points critical to
safety, and ranks each failure according to criticality category of
failure effect and probability of occurrence. The total analysis
is conducted in two steps: the Failure Mode and Effects Analysis
(FMEA), and the Criticality Analysis. It has been found most
practical to assume that the effects of each failure studied
during the analysis are not negated by the occurrence of a
benign failure.

1.3.2 Objectives of Conducting FMECA 

The FMEA is accomplished to provide: 

a. The design engineer with a method of selecting a design 
with a high probability of safe operation, 

b. Early visibility of system interface problems,

c. Identification of single failure points critical to 
system safety, 

d. Early criteria for test planning, 

e. Quantitative and uniformly formatted data input to the system
safety prediction, assessment, or other safety study.

1.3.3 Application Of The FMECA Method 

An FMECA should be initiated as an integral part of the early 
design phase of system functional assemblies. If a Gross Hazards 
Analysis has been conducted, the results can be used to guide 
the development of the FMECA. Subsystems which the Gross Hazards 
Analysis has indicated are most hazardous can be developed first
in the logic diagram for the failure mode and effects study. 

An FMECA should be performed at the highest system level feasible. 
This facilitates a safety criticality ranking of all of the 
major system elements so the FMECA effort can be allocated to 
those elements which are most determinant upon overall safety. 

Proposed design changes can be incorporated in the analysis, 
and the effect on system safety can be predicted. Changes which 
are proposed to enhance safety should be considered from all 
aspects to ensure that the modification is cost effective and 
that the state-of-the-art is reflected in the new design.

FMECA, like all analytical tools, can be conducted on
completed systems. The increased cost of modifying a physical
system is a major determining factor for safety improvements.
As a result, the improvements recommended for completed
systems must be very cost effective. Therefore, it is incumbent
on the analyst to be as accurate as possible in the prediction of
safety improvements so that safety costs can be fairly evaluated.

1.3.4 Procedure of FMECA

FMECA is performed in two phases: (1) Failure Mode and 
Effects Analysis (FMEA), and (2) Criticality Analysis (CA). 

The combination of these two phases provides (3) Failure Mode 
Effects and Criticality Analysis (FMECA). Section 2 provides 
procedures for FMEA; Section 3 provides procedures for CA; and 
Section 4 combines the FMEA and CA into the FMECA.

2.0 PROCEDURE FOR FAILURE MODE AND EFFECTS ANALYSIS

2.1 SYSTEM DEFINITION 

2.1.1 Accomplishment 

Accomplishment of an FMEA on a system consists of the following general
steps:

a. Obtain all descriptive information available on the system to be
analyzed. This should include such documents as functional block 
diagrams, system descriptions, specifications, drawings, system 
component identification coding, operational profiles, environmental 
profiles, and reports bearing on reliability and safety such as 
feasibility or reliability studies of the system being analyzed and 
of past similar systems. 

b. Construct a logic block diagram of the system to be analyzed, similar
to that shown in Figure E-1, for each equipment configuration involved
in the system's use.

The diagrams are developed starting at the top level of the system and 
extending downward to the lowest level of system definition at the time 
of analysis. These logic block diagrams are not descriptive block 
diagrams of the system that show the interconnection of equipments. 

The logic block diagrams used for an FMEA show the functional
interdependencies between the system components so that the effects of a
functional failure may be readily traced through the system.

All redundancies or other means for preventing failure effects should 
be shown as functional blocks or notes. 

Where certain functions are not required in an operational time phase, 
the information may be shown by a dotted block as in the case of 
component 0.5 in Figure E-l or by other suitable means. 

c. At the lowest level of system definition, as developed from the top down,
analyze each failure mode of the system component and its effect on
the system. Where system functional definition has not reached the
level of identification of the system functions with the specific type
of hardware that will perform these functions, the FMEA should be based
upon failure of the system functions, giving the general type of hardware
envisioned as the basis for system design.

Four basic conditions of component or functional failure should be 
considered: 

1) Premature operation

2) Failure to operate at a prescribed time

3) Failure to cease operation at a prescribed time

4) Failure during operation.

The FMEA assumes that only the failure under consideration has occurred. 
When redundancy or other means have been provided in the system to 
prevent undesired effects of a particular failure, the redundant element 
is considered operational and the failure effects terminate at this point 
in the system. When the effects of a failure propagate to the top level 
of a system and cause the system to fail, the failure is defined as a 
critical failure in the system. 

When an FMEA is being performed on a system which is already built, the 
analyst may find cases where redundancies or other means of preventing 
failure effects do little to improve the failure situation or where the 
redundancies may actually worsen it. These cases should be reported 
for the next higher level. Where the scope of the FMEA program permits,
the redundancy or other failure effects preventive means should not halt
the continuation of the failure effects analysis toward the top level of
the system.

d. Document each potential failure mode of each system component and the
effects of each failure mode on the system by completing an FMEA format
similar to that shown in Figure E-2. Instructions for filling out the
FMEA format are given in Section 2.3.
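
The single-failure rule in step c (only the failure under consideration has
occurred, redundant elements are considered operational, and a failure whose
effects reach the top level is critical) can be illustrated with a small
program. This is a minimal sketch under stated assumptions: the logic block
diagram is reduced to a dictionary in which each block either needs all of
its inputs or is a redundant group that needs only one. The block names and
the diagram layout are hypothetical and are not taken from Figure E-1.

```python
def block_fails(block, failed_component, diagram):
    """Return True if `block` is lost when only `failed_component` fails."""
    node = diagram.get(block)
    if node is None:
        # A block with no entry is a lowest-level component.
        return block == failed_component
    lost = [block_fails(child, failed_component, diagram)
            for child in node["inputs"]]
    # A redundant group is lost only if every path is lost; a series
    # dependency is lost if any one input is lost.
    return all(lost) if node["redundant"] else any(lost)


def components(diagram):
    """Lowest-level components: names used as inputs but never expanded."""
    used = {c for node in diagram.values() for c in node["inputs"]}
    return sorted(used - set(diagram))


def critical_single_failures(top, diagram):
    """Components whose single failure propagates to the top-level block."""
    return [c for c in components(diagram) if block_fails(top, c, diagram)]


# Hypothetical diagram: the stage needs propulsion and power; power is
# supplied by two redundant batteries.
DIAGRAM = {
    "stage": {"inputs": ["propulsion", "power"], "redundant": False},
    "propulsion": {"inputs": ["engine", "feed_valve"], "redundant": False},
    "power": {"inputs": ["battery_a", "battery_b"], "redundant": True},
}

print(critical_single_failures("stage", DIAGRAM))
```

In this sketch the effects of a single battery failure terminate at the
redundant power block, so neither battery appears in the critical list,
while the engine and feed valve do.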

2.1.2 Input Documentation 

The following documentation is representative of the information required 
for system definition and analysis: 

2.1.2.1 System Technical Development Plans

To define what constitutes and contributes to the various types of system
failure, the technical development plans for the system should be studied.
The plans will normally state the system objectives and specify design
requirements for operations, maintenance, test, and activation. Detailed
information in the plans will normally provide a mission or operational
profile and a functional flow block diagram showing the gross functions
that the system must perform. Time diagrams and charts used to describe
system functional sequence will aid the analyst in determining the time
feasibility of various means of failure detection and correction in the
operating system. Also required is a definition of the operational and
environmental stresses that the system is expected to undergo and a list of
the acceptable conditions of functional failure under these stresses.

2.1.2.2 Trade-Off Study Reports

To determine the possible and more probable failure modes and causes in the
system, trade-off study reports should identify the areas of marginal design
and should explain the design compromises and operating conditions agreed upon.

2.1.2.3 System Description and Specifications

The descriptions and specifications of the system's internal and interface 
functions, starting at the highest system level and progressing to the 
lowest level of system development to be analyzed, are required for con- 
struction of the FMEA logic block diagrams. A logic block diagram as used 
in the FMEA and as described in Paragraph 2.1.1.b shows the functional
interdependence within the system and permits the effects of a failure to 
be traced. System descriptions and specifications usually include either 
or both functional and equipment block diagrams that facilitate the con- 
struction of the logic block diagrams required for the FMEA. In addition, 
the system descriptions and specifications give the limits of acceptable 
performance under specified operating and environmental conditions. 

2.1.2.4 Equipment Design Data and Drawings

Equipment design data and drawings identify the equipment configuration 
performing each of the system functions. 

Where functions shown on an FMEA functional block diagram depend on a
replaceable module in the system, a separate FMEA may be performed on the
internal functions of the module. The effects of possible component failure
modes in the module on module inputs and outputs then describe the failure
modes of the module when it is viewed as a system component.

2.1.2.5 Coding Systems

For consistent identification of system functions and equipment, an approved
coding system should be adhered to during the analysis. Use of coding
systems common to the overall program is preferable.

2.1.2.6 Test Results

Tests run on the specific equipment under the identical conditions of use are 
desired. When such test data are not available, the analyst should collect 
and analyze the data obtained from studies and tests performed during current 
and past programs on equipment similar to those in the system and under 
similar use conditions. 

2.2 LOGIC BLOCK DIAGRAM

The next step of the FMEA procedure is the construction of a logic block
diagram of the system to be analyzed. The general reliability logic block
diagram scheme for a system is shown in Figure E-1. This example system is
for a space vehicle stage, and the notes given explain the functional
dependencies of the stage components.

A system component at any level in the stage system may be treated as a
system and may be diagrammed in like manner for failure mode and effects
analysis. The results of the component's FMEA would define the failure
modes critical to the component's operation, i.e., those that cause loss of
component inputs or outputs. These failure modes will then be used to
accomplish the FMEA at the next higher system level. This procedure
ultimately leads to an FMEA for the stage, the space vehicle, and space
system.

All system redundancies or other means for preventing failure effects are
shown in the logic block diagram. This is because in single failure analysis,
when a means exists to prevent the effects of a failure, the failure cannot
be critical above the system level where the preventive means is effective.



2.3 FAILURE MODE AND EFFECTS ANALYSIS 


The FMEA and its documentation are the next steps of the procedure. These are 
accomplished by completing the columns of an FMEA format similar to that 
given in Figure E-2 as follows: 


Column
Number    Explanation or Description of Entries

(1)       Name of system function or component under analysis for
          failure modes and effects. Breakdown of a system for
          analysis should normally be down to the lowest practicable
          level at the time of the FMEA. In special cases such as
          electronic systems using integrated modular units as system
          building blocks, the modules may be listed rather than
          their parts.

(2)       Drawing number by which the contractor identifies and
          describes each component or module. These drawings should
          include configuration, mechanical, and electrical
          characteristics.

(3)       Reference designation used by manufacturer to identify the
          component or module on the schematic. Applicable schematic
          and wiring drawing numbers should also be listed.

(4)       Identification number of FMEA logic block diagram and of
          the function.

(5)       Concise statement of the function performed.

(6)       Give the specific failure mode after considering the four
          basic failure conditions:

          1) Premature operation.

          2) Failure to operate at a prescribed time.

          3) Failure to cease operation at a prescribed time.

          4) Failure during operation.

          For each applicable failure mode, describe the cause
          including operational and environmental stress factors
          if known.

(7)       Phase of mission in which critical failure occurs, e.g.,
          Prelaunch: checkout, countdown; Flight: boost phase,
          earth orbit, translunar, lunar landing, etc. Where the
          subphase, event, or time can be defined from approved
          operational or flight profiles, the most definitive timing
          information should also be entered for the assumed time of
          critical failure occurrence. The most definitive time
          information that can be determined should also be given for
          the failure effects under the columns titled "Failure
          Effects On."

(8)       A brief statement describing the ultimate effect of the
          failure on the function or component being analyzed.
          Examples of such statements are component rendered useless,
          component's usefulness marginal, or structurally weakened
          to unacceptable reliability level. Timing information as
          described under (7) should be given.

(9)       A brief description of the effect of the failure on the next
          higher assembly. Timing information as described under
          (7) should be given as to time of failure effect.

(10)      A description of the effect of the component failure on the
          system. For the major systems of the overall space system,
          these effects are divided into failures affecting equipment
          safety and failures affecting personnel safety. Examples
          of failures affecting equipment safety are vehicle loss,
          stage damage, etc. Examples of failures affecting personnel
          safety are loss of crew, abort during flight, and loss of
          redundancy in safety systems. For lower level systems where
          effects on the overall space system are unknown, the effects
          of a failure on the system under analysis may be described
          as loss of system inputs or outputs. Examples of such
          effects are loss of signal output, loss of output pressure,
          and shorted power input. Timing information as described
          under (7) should be given.

(11)      A description of the methods by which the failure could be
          detected. Identify which of the following categories the
          failure detection means falls under:

          1) On-board visual/audible warning devices.

          2) Automatic abort-sensing devices.

          3) Ground operational support system failure-sensing
             instrumentation.

          4) Flight telemetry, ground support equipment console.

          5) None.

          Timing information as described under (7) should be given
          with respect to the reaction time available between time
          of component failure, time of detection, and time of
          critical failure effect.

(12)      A description of the corrective actions that the flight
          crew and the ground crew could take to circumvent the failure.
          If applicable, the time available for effective action and
          the time required should be noted.

(13)      State the useful life of the item under given environmental
          conditions.
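
One way to keep worksheet entries uniform is to hold each row in a typed
record whose fields mirror columns (1) through (13). The following Python
sketch is an illustration only: the class and field names are assumptions
introduced here, and the sample row is entirely hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class FailureCondition(Enum):
    """The four basic failure conditions considered under column (6)."""
    PREMATURE_OPERATION = 1
    FAILURE_TO_OPERATE = 2
    FAILURE_TO_CEASE_OPERATION = 3
    FAILURE_DURING_OPERATION = 4


class DetectionCategory(Enum):
    """The five failure detection categories listed under column (11)."""
    ONBOARD_WARNING_DEVICE = 1
    AUTOMATIC_ABORT_SENSING = 2
    GROUND_FAILURE_SENSING_INSTRUMENTATION = 3
    TELEMETRY_OR_GSE_CONSOLE = 4
    NONE = 5


@dataclass
class FmeaEntry:
    name: str                             # column (1)
    identification_number: str            # column (2)
    reference_designation: str            # column (3)
    logic_diagram_number: str             # column (4)
    function: str                         # column (5)
    failure_condition: FailureCondition   # column (6)
    failure_mode_and_cause: str           # column (6)
    mission_phase: str                    # column (7)
    effect_on_component: str              # column (8)
    effect_on_next_assembly: str          # column (9)
    effect_on_system: str                 # column (10)
    detection: DetectionCategory          # column (11)
    corrective_action: str                # column (12)
    useful_life: str                      # column (13)


# Hypothetical row; none of these values come from the handbook.
row = FmeaEntry(
    name="Vent valve",
    identification_number="DWG-0000",
    reference_designation="V1",
    logic_diagram_number="0.5",
    function="Relieve tank pressure",
    failure_condition=FailureCondition.FAILURE_TO_OPERATE,
    failure_mode_and_cause="Fails closed; contamination",
    mission_phase="Prelaunch: countdown",
    effect_on_component="Component rendered useless",
    effect_on_next_assembly="Tank pressure rises",
    effect_on_system="Stage damage",
    detection=DetectionCategory.GROUND_FAILURE_SENSING_INSTRUMENTATION,
    corrective_action="Halt loading; time available not established",
    useful_life="Not established",
)
print(row.name, row.failure_condition.name, row.detection.name)
```

Restricting columns (6) and (11) to enumerated values keeps entries from
different analysts comparable when the worksheets are merged.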


Figure E-2: FAILURE MODE AND EFFECTS ANALYSIS format (column headings)

System ____    FAILURE MODE AND EFFECTS ANALYSIS    Page __ of __ Pages

(1)   Name
(2)   Identification Number
(3)   Drawing Reference Designation
(4)   Reliability Logic Diagram Number
(5)   Function
(6)   Failure Mode and Cause
(7)   Mission Phase
      Failure Effects On:
(8)   Component/Functional Assembly
(9)   Subsystem
(10)  System
(11)  Failure Detection Method
(12)  Corrective Action; Time Available/Time Required
(13)  Useful Life

3.0 PROCEDURES FOR CRITICALITY ANALYSIS

3.1 CRITICALITY PROCEDURE

The Criticality Analysis (CA) determines a system component's magnitude of
criticality to system safety.

The CA is performed in two steps: 

a. Identify critical failure modes of all components in the FMEA for each
equipment configuration in accordance with the categories listed in
Paragraph 3.2. For FMEA's of lower level systems where the effect of
failure modes on mission success or crew safety cannot be determined,
the critical failure modes will be those that cause failure of one or
more of the system's inputs or outputs.

The specific type of system failure is expressed as a unique loss
statement. For major Apollo systems, example loss statements are crew loss,
abort, and vehicle loss. For lower level systems, example loss
statements are output signal loss, input power shorted, and loss of output
pressure.

b. Compute Criticality Numbers (C_r) for each system component with critical
failure modes. The method is given in Paragraph 3.3, and a format for
the data is shown in Figure E-3.

The C_r for a system component is the number of system failures of a
specific type expected per million missions due to the component's
critical failure modes.

Where the factors involved in the calculation of system component
criticality numbers vary with mission time, the mission is divided into
mission phases such that the change in the factors is negligible during
each phase. A criticality number is computed for each mission phase for
a given loss statement.

The analyst responsible for the CA at the next higher system level
continues the analysis using lower level CA's. Where the loss of an input
or output of a lower level equipment is critical to equipment operational
success at his system level, action should be taken to design the
criticality out of the system or to reduce its criticality to an acceptable
level by improvements in basic reliability, redundancy, or other means.

3.2 CRITICAL FAILURE MODE IDENTIFICATION 

The first step of CA is the identification of critical failure modes from the
FMEA's on the system.

Critical failure modes at higher levels in the overall space system should
be identified according to approved nonambiguous loss statements. The
following categories may be used:

HARDWARE CRITICALITY CATEGORIES

Category 1 -  Hardware, failure of which results in loss of life of any
              crew member. This includes normally passive systems, i.e.,
              emergency detection system, launch escape system, etc.

Category 2 -  Hardware, failure of which results in damage to the system but
              does not cause loss of life.

Category 3 -  Hardware, failure of which will not result in system damage nor
              cause loss of life.


At the lower system level where it is not possible to identify critical failure
modes according to loss statements under the categories above, approved loss
statements based upon loss of system inputs or outputs should be used (see
Paragraph 3.1.a). Kennedy Space Center loss statements can be found in NASA
Kennedy Space Center Publication KSC-STD-118(D), 3 February 1965, "Failure
Effect Analysis of Ground Support Equipment". Marshall Space Flight Center
loss statements can be found in NASA Marshall Space Flight Center Drawing
No. 10M30111, Revision A, 26 June 1964, "Procedure for Performing Systems
Design Analysis".

The loss statement used to identify a critical failure mode in a system should
be prefixed with the word "actual", "probable", "possible", or "none", which
represents the analyst's judgment as to the conditional probability that the
loss will occur given that the failure mode has occurred.

3.3 CRITICALITY NUMBER CALCULATION 

The second step of the CA procedure is the calculation of Criticality Numbers
(C_r) for the system components with critical failure modes.

A C_r for a system component is the number of system failures of a specific
type expected per million missions due to the component's critical failure
modes. The specific type of system failure is expressed by the critical
failure mode loss statement discussed in Paragraph 3.2.

For a particular loss statement and mission phase, the C_r for a system
component with critical failure modes is calculated with the following formula:

$$C_r = \sum_{n=1}^{j} \left( \beta \, \alpha \, K_E \, K_A \, \lambda_G \, t \cdot 10^6 \right)_n , \qquad n = 1, 2, 3, \ldots, j$$

Where:

C_r = Criticality number for the system component.

j   = Total number of critical failure modes in the system component
      under loss statement.

β   = Conditional probability that the failure effects of the critical
      failure mode occur given that the critical failure mode has
      occurred.

α   = Fraction of all failures (or λ_G) experienced by a component
      that are due to the particular failure mode under consideration.

K_E = Environmental factor which adjusts λ_G for the difference between
      environmental stresses when λ_G was measured and the environmental
      stresses under which the component is going to be used.

K_A = Operational factor which adjusts λ_G for the difference between
      operating stresses when λ_G was measured and the operating stresses
      under which the component is going to be used.

λ_G = Generic failure rate of the component in failures per hour or cycle.

t   = Operating time in hours or number of operating cycles of the
      component.

n   = An index of summation for critical failure modes in the system
      component that fall under a particular loss statement.

The factor β is the probability of loss discussed in Paragraph 3.2, and
should be limited to the following values:


Failure Effects      Value of Beta

Actual Loss          100 Percent

Probable Loss        Greater than 10 Percent to 100 Percent

Possible Loss        0 Percent to 10 Percent

None                 0 Percent
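
A criticality worksheet can check that the β entered for a failure mode
agrees with the prefix on its loss statement. The sketch below is one
possible check, not part of the procedure itself. It assumes that "probable"
excludes exactly 10 percent, that "possible" includes both 0 and 10 percent,
and that the prefix "none" corresponds to 0 percent. The source leaves these
endpoints open, so they are assumptions, as is the function name.

```python
def beta_matches_prefix(prefix, beta):
    """Check a conditional probability (0.0 to 1.0) against a loss prefix."""
    prefix = prefix.lower()
    if prefix == "actual":
        return beta == 1.0
    if prefix == "probable":
        return 0.10 < beta <= 1.0    # endpoint handling is an assumption
    if prefix == "possible":
        return 0.0 <= beta <= 0.10   # endpoint handling is an assumption
    if prefix == "none":
        return beta == 0.0
    raise ValueError("unknown loss statement prefix: %r" % prefix)


print(beta_matches_prefix("Probable", 0.50))
print(beta_matches_prefix("possible", 0.50))
```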


The expression (β α K_E K_A λ_G t · 10^6) is the portion of C_r for the
component due to one of its critical failure modes under a particular loss
statement.

After calculation of the part of C_r due to each of the component's critical
failure modes under the loss statement, these parts are summed for all
critical failure modes as indicated by:

$$\sum_{n=1}^{j}$$

A failure mode failure rate is represented in the formula for C_r by the
product of the terms α, K_E, K_A, and λ_G. These terms should be replaced by
actual failure mode failure rates determined from the test program as they
become available. A sample calculation is given below.

3.3.1 C_r Calculation Example

For a given mission phase:

Given: System component with λ_G = 0.05 failures per 10^6 operating hours,
       K_A = 10, K_E = 50,
       α = 0.30 for one critical failure mode under loss statement, and
       α = 0.20 for the second critical failure mode under the same loss
       statement.

       Let β = 0.50 and t = 10 hours.

Find:  C_r for this system component during this mission phase.

Solution:

For the first critical failure mode; i.e., for n = 1

$$(\beta \alpha K_E K_A \lambda_G t \cdot 10^6)_1 = (0.50)(0.30)(50)(10)(0.05 \times 10^{-6})(10)(10^6) = 37.5 \approx 38$$

For the second critical failure mode; i.e., for n = 2

$$(\beta \alpha K_E K_A \lambda_G t \cdot 10^6)_2 = (0.50)(0.20)(50)(10)(0.05 \times 10^{-6})(10)(10^6) = 25$$

j = 2 and

$$C_r = \sum_{n=1}^{2} (\beta \alpha K_E K_A \lambda_G t \cdot 10^6)_n = 38 + 25 = 63$$
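
The sample calculation can be checked with a few lines of Python. This is a
minimal sketch of the summation over critical failure modes. The function
and argument names are illustrative assumptions, and the inputs are the
values given in the example. The unrounded contributions are 37.5 and 25.0,
totaling 62.5, which the example reports as 38, 25, and 63.

```python
def mode_contribution(beta, alpha, k_e, k_a, lambda_g, t):
    """Portion of C_r from one critical failure mode, per million missions."""
    return beta * alpha * k_e * k_a * lambda_g * t * 1e6


def criticality_number(modes):
    """Sum the contributions of all critical failure modes (n = 1 .. j)."""
    return sum(mode_contribution(**mode) for mode in modes)


# Values from the example: lambda_G is 0.05 failures per 10^6 operating hours.
common = dict(beta=0.50, k_e=50, k_a=10, lambda_g=0.05e-6, t=10)
modes = [dict(alpha=0.30, **common), dict(alpha=0.20, **common)]

for n, mode in enumerate(modes, start=1):
    print("mode %d contribution: %.1f" % (n, mode_contribution(**mode)))
print("C_r: %.1f" % criticality_number(modes))
```

Where the factors vary with mission time, the same function would be called
once per mission phase with that phase's operating time and factors.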


3.3.2 Format for C_r Calculation

The columns of the format for C_r calculations shown in Figure E-3 should be
filled out as follows:

Column
Number        Explanation or Description of Entries

(1) - (7)     These columns duplicate the information given in the same
              columns of the FMEA format shown in Figure E-2 and are
              explained in Paragraph 2.3.

(8)           Failure effects given for the highest system level on the
              FMEA.

(9)           The source of reliability information used for each
              calculation should be identified in this column.

(10) - (16)   Enter the information required for the calculation of the
              portion of the component's criticality number due to each
              of its critical failure modes.

(17)          Enter the component's criticality numbers in this column.
              This is the sum of the portions of the criticality number
              entered in column (16) due to a particular mission phase
              and loss statement.

Figure E-3: Format for C_r calculation (column headings)

(1)   Name
(2)   Identification Number
(3)   Drawing Reference Designation
(4)   Rel. Logic Diagram Number/Function Number
(5)   Function
(6)   Failure Mode and Cause
(7)   Mission Phase
(8)   Failure Effects
(9)   Reliability Data Source Code
(10)  Probability of Failure Effects, β
(11)  Failure Mode Ratio, α
(12)  Environmental Ratio, K_E
(13)  Operational Ratio, K_A
(14)  Generic Failure Rate, Failures/Hour or Cycle
(15)  Operating Time, Hours or Cycles, t
(16)  Critical Failure Mode Contribution
(17)  Component Criticality Number, C_r

4.0 SUMMARY OF FMEA AND CA

4.1 PREPARATION OF FMECA SUMMARY

The procedure is a method for combining the criticality values by mission
phase to develop an overall summary.

The FMECA summary is developed from the FMEA and CA
discussed in Sections 2 and 3 and is prepared by completing a form
similar to that given in Figure E-4. Instructions for completing the form
are given below.

A criticality list is prepared. Critical system components are grouped
according to loss statement and are listed in the groups in descending order
according to the magnitude of their total criticality number for the particular
loss statement. A system component's total criticality number for a particular
loss statement is computed from the FMECA summary information. Examples of
ground rules for this are given below.
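
The grouping and ordering of the criticality list can be sketched as
follows. This is an illustration under stated assumptions: each component's
total criticality number for a loss statement is taken as already computed,
and the component names and numbers are hypothetical.

```python
from collections import defaultdict


def criticality_list(entries):
    """Group (component, loss_statement, total_criticality) entries by loss
    statement and sort each group by descending total criticality number."""
    groups = defaultdict(list)
    for component, loss_statement, total in entries:
        groups[loss_statement].append((component, total))
    return {
        loss: sorted(items, key=lambda item: item[1], reverse=True)
        for loss, items in groups.items()
    }


# Hypothetical totals, not taken from the handbook.
entries = [
    ("feed valve", "vehicle loss", 12.0),
    ("engine", "vehicle loss", 63.0),
    ("escape system relay", "crew loss", 4.0),
    ("vent valve", "vehicle loss", 30.5),
]

for loss, items in criticality_list(entries).items():
    print(loss, items)
```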


A general FMECA summary form is shown in Figure E-4. 
completed as follows: 


The columns are 


Column
Number       Explanation or Description of Entries

(1) - (5)    Identification and function of the system component with
             critical failure modes are the same as those for the FMEA
             format in Figure E-1, which is described in Paragraph 2.3.

(6)          For each system component, enter its critical failure
             modes and, if known, their causes.

(7) - (9)    If the critical failure mode has an effect during Phase 1
             of the mission, its effect on the system is given in
             Column (7) with mission time or event. The approved loss
             statement for the effect is given in Column (8). The
             portion of the total criticality number calculated for the
             critical failure mode according to the example given in
             Paragraph 3.3.1 is entered in Column (9).

(10) - (12)  Where the critical failure mode has an effect during
             Phase 2 of the mission, Columns (10)-(12) are completed
             in the same manner as Columns (7)-(9). This format should
             be extended to include all mission phases.

(13)         A total criticality number may be computed for each system
             component according to approved ground rules. An example
             of ground rules is as follows:
             a. Each criticality number in the mission phase columns
                shall be multiplied by an approved importance
                weighting factor for its particular loss statement.

                Example for stage/module level FMECA: Kills Crew =
                1.0, Damages Vehicle = 0.5, Precludes Escape = 0.4,
                Loses Protective Devices = 0.3.

                Example for subsystem level FMECA: Loss of critical
                output or input which could lead to crew loss = 1.0,
                Loss of noncritical input or output = 0.2, Annoyance
                failure = 0.1.

                These examples are given only to convey the intent. A
                lengthy list of statements of actual loss may be ranked
                in relative importance by this means.

             b. A given critical failure mode in a system component
                shall occur only once during the mission, assuming no
                repair; therefore, the largest weighted criticality
                number for a critical failure mode will be selected
                from among the mission phase columns for calculation
                of the component's total criticality number.

             c. A component's total criticality number for a particular
                loss statement shall be the sum of the weighted
                criticality numbers with the same loss statement
                selected from the mission phase columns according to
                ground rule b, preceding.

             d. Each total criticality number with loss statement for
                a system component as calculated by ground rule c,
                above, shall be entered in Column (13) of the FMECA
                summary format.
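
The example ground rules can be illustrated with a short sketch. This is only one reading of ground rules a through c, written in Python. The function name, the data layout, and the sample criticality numbers are hypothetical; the weighting factors are the stage/module level example values quoted under ground rule a.

```python
from collections import defaultdict

# Importance weighting factors: the stage/module level example values
# from ground rule a. A real analysis would use its own approved factors.
WEIGHTS = {
    "Kills Crew": 1.0,
    "Damages Vehicle": 0.5,
    "Precludes Escape": 0.4,
    "Loses Protective Devices": 0.3,
}


def total_criticality(failure_modes, weights=WEIGHTS):
    """Return {loss statement: total criticality number} for one component.

    failure_modes maps each critical failure mode to a list of
    (loss statement, criticality number) pairs, one pair per mission
    phase in which the mode has an effect (hypothetical layout).
    """
    totals = defaultdict(float)
    for phase_entries in failure_modes.values():
        if not phase_entries:
            continue
        # Ground rule a: weight each mission phase criticality number.
        weighted = [(weights[loss] * number, loss)
                    for loss, number in phase_entries]
        # Ground rule b: keep only the largest weighted number per mode
        # (on a tie, the earliest phase entry is kept).
        best_number, best_loss = max(weighted, key=lambda entry: entry[0])
        # Ground rule c: sum the selected numbers by loss statement.
        totals[best_loss] += best_number
    return dict(totals)


# Hypothetical component with three critical failure modes.
valve_modes = {
    "fails closed": [("Damages Vehicle", 40.0), ("Kills Crew", 10.0)],
    "leaks": [("Kills Crew", 5.0), ("Kills Crew", 8.0)],
    "fails open": [("Damages Vehicle", 6.0)],
}
result = total_criticality(valve_modes)
print(result)
```

In this sample, the "fails closed" mode is counted under Damages Vehicle, because its weighted number there (0.5 x 40.0) exceeds its weighted number under Kills Crew (1.0 x 10.0). The values returned per loss statement correspond to the Column (13) entries of ground rule d.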

4.2 CRITICALITY LIST 

The last step of the FMECA is the preparation of the criticality list. 
Critical system components are grouped according to loss statement and are 
listed in the groups in descending order according to magnitude of their 
total criticality number for the loss statement. A system component may
appear in more than one of the groups. Appropriate supporting information 
and recommendations should be given for each of the listed components. 
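
The grouping and ordering described above can be sketched as follows. This is a minimal illustration in Python, assuming each component's total criticality numbers are already available per loss statement; the function name, component names, and numbers are hypothetical.

```python
def criticality_list(component_totals):
    """Group components by loss statement, largest total first.

    component_totals maps a component name to a dictionary of
    {loss statement: total criticality number} (hypothetical layout).
    """
    groups = {}
    for component, totals in component_totals.items():
        # A component is listed once for every loss statement it carries,
        # so it may appear in more than one group.
        for loss_statement, number in totals.items():
            groups.setdefault(loss_statement, []).append((component, number))
    for entries in groups.values():
        # Descending order of total criticality number within each group.
        entries.sort(key=lambda entry: entry[1], reverse=True)
    return groups


# Hypothetical totals for three components.
sample_totals = {
    "Valve A": {"Damages Vehicle": 23.0, "Kills Crew": 8.0},
    "Pump B": {"Kills Crew": 15.0},
    "Regulator C": {"Damages Vehicle": 30.0},
}
listing = criticality_list(sample_totals)
for loss_statement, entries in listing.items():
    print(loss_statement)
    for component, number in entries:
        print(f"  {component}: {number}")
```

Supporting information and recommendations for each listed component would be attached separately; the sketch covers only the grouping and ordering.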




[Figure E-4: FMECA summary form. Column headings:]

Item Identification
  (1)  Name
  (2)  Identification Number
  (3)  Drawing Reference Designation
  (4)  Reliability Logic Diagram Number/Function Number
(5)  Function
(6)  Failure Mode and Cause
Mission Phase Criticality
  Phase 1
    (7)  Failure Effect
    (8)  Loss Statement
    (9)  Criticality Number
  Phase 2
    (10) Failure Effect
    (11) Loss Statement
    (12) Criticality Number
(13) System Component Total Criticality No.




NUMBER D2-119062-1
REV LTR


LIMITATIONS


This document is controlled by 5-8231 KSC TIE System Safety.

All revisions to this document shall be approved by the
above noted organization prior to release.


ACTIVE SHEET RECORD

SHEET NUMBER

1, 2, 3, 4, 5, 6, 7, 8
1-0, 1-1, 1-2, 1-3, 1-4, 1-5
2-0, 2-1, 2-2, 2-3, 2-4, 2-5, 2-6
3-0, 3-1, 3-2, 3-3, 3-4, 3-5, 3-6, 3-7, 3-8
4-0, 4-1, 4-2, 4-3, 4-4, 4-5, 4-6, 4-7, 4-8, 4-9
4-10, 4-11, 4-12, 4-13, 4-14, 4-15, 4-16, 4-17, 4-18, 4-19
4-20, 4-21, 4-22, 4-23, 4-24, 4-25, 4-26, 4-27
5-0, 5-1
6-0, 6-1, 6-2, 6-3, 6-4, 6-5, 6-6, 6-7, 6-8
A-001, A-002, A-101, A-201, A-301, A-401, A-402, A-403, A-501, A-502
B-001, B-002, B-003
BI-101, BI-102, BI-201, BI-301, BI-302, BI-303, BI-304





BI-305, BI-306, BI-307, BI-308, BI-309, BI-310, BI-311
BI-401, BI-402
BII-100, BII-101, BII-102, BII-103, BII-104, BII-105, BII-106
BII-201, BII-202, BII-203, BII-204, BII-205, BII-206, BII-207, BII-208
BII-209, BII-210
C-001, C-002, C-003
C-101, C-102, C-103, C-104, C-105, C-106
C-201, C-202, C-203, C-204, C-205, C-206, C-207, C-208
C-209, C-210, C-211, C-212, C-213, C-214, C-215, C-216
C-217, C-218, C-219, C-220, C-221, C-222, C-223, C-224
C-225, C-226, C-227, C-228, C-229, C-230, C-231, C-232
C-233, C-234, C-235, C-236, C-237, C-238, C-239, C-240
C-241, C-242, C-243, C-244, C-245, C-246, C-247, C-248
C-249, C-250, C-251, C-252





















D-001, D-101, D-102
D-201, D-202, D-203
D-301, D-302, D-303, D-304
D-401
D-501, D-502, D-503, D-504, D-505, D-506, D-507, D-508
D-509, D-510, D-511, D-512
E-001, E-002, E-003
E-101, E-102, E-103
E-201, E-202, E-203, E-204, E-205, E-206, E-207, E-208
E-301, E-302, E-303, E-304, E-305, E-306
E-401, E-402, E-403
1001, 1002, 1003, 1004, 1005





